ABSTRACT


The paper surveys the present state of the theory of
linear, least squares prediction of q-variate weakly stationary
stochastic processes with discrete time. The emphasis is on
logical order. Hence recent developments are described within
the context of a general theory rather than chronologically.
Methods for computing the predictor are briefly discussed, but
purely statistical questions such as the estimation of covariances
are omitted.
RECENT TRENDS IN MULTIVARIATE PREDICTION THEORY

P.  Masani 

1. Introduction

From among the many facets of multivariate prediction we will
consider only the theory of linear, least squares prediction of
q-variate, weakly stationary stochastic processes with discrete time. Our
purpose is to give a coherent account of the present state of this
theory. We shall therefore refer to recent developments not in isolation
but within the context of the general theoretical framework.
Our emphasis will be on generality and logical order, but the practical
side will also be discussed, though somewhat briefly (cf. §§2, 15).
Statistical questions of estimation, etc. will be omitted.

To recall the problem involved in such prediction, suppose that
$\mathbf{x}$ is a q-dimensional vector quantity associated with some long-enduring
mechanism in nature, and that $\mathbf{x}_n$ denotes its value at time
t = n. Suppose that we have been measuring $\mathbf{x}$ every second
from the remote past up to the present moment t = 0, and have so
obtained a sequence of readings

(1.1)  $\mathbf{x}_k = \mathbf{a}_k, \quad k = 0, -1, -2, \dots$

Is there some way to forecast the future value $\mathbf{x}_\nu$, $\nu > 0$, on the
basis of the information contained in (1.1)? Without further knowledge
of the mechanism our answer to this question must be in the negative.
If, however, we assume that our mechanism is such that the sequence
$(\mathbf{x}_k)_{k=-\infty}^{0}$ is part of a time-sequence (sample-function) of a q-variate
stationary stochastic process (SP) $(\mathbf{f}_n)_{n=-\infty}^{\infty}$ over a probability space
$(\Omega, \mathcal{B}, P)$, so that

(1.2)  $\mathbf{x}_n = \mathbf{f}_n(\omega_0), \quad \omega_0 \in \Omega, \quad -\infty < n < \infty,$

and that we know the probabilistic structure of this SP, then the
answer is in the affirmative, as we proceed to indicate.

Sponsored by the Mathematics Research Center, United States Army,

Denote the forecast value of $\mathbf{x}_\nu$ by $\hat{\mathbf{x}}_\nu$. As $\hat{\mathbf{x}}_\nu$ is to depend
on the past and present values $\mathbf{x}_k$, $k \le 0$, alone, we must expect
that $\hat{\mathbf{x}}_\nu \ne \mathbf{x}_\nu$ except when our mechanism is purely deterministic. Such
mechanisms are of course very important, but they are only of peripheral
interest in the theory of probabilistic or statistical prediction,
which concerns us here. In this theory the problem is to find the
$\hat{\mathbf{x}}_\nu$ which comes closest to $\mathbf{x}_\nu$ under some preassigned statistical
error criterion. In least squares, linear prediction we adopt the
root-mean-square (RMS) error criterion, and confine attention to $\hat{\mathbf{x}}_\nu$
which are mean limits of linear combinations $\sum_{k=0}^{n} A_k^{(n)} \mathbf{x}_{-k}$, where
the $A_k^{(n)}$ are $q \times q$ matrices (*). It can be shown that the $A_k^{(n)}$ are
determinable and the problem solvable when the covariance structure
of the stationary SP $(\mathbf{f}_n)_{-\infty}^{\infty}$ is known.


To state the problem in greater detail, we are given a bisequence
$(\mathbf{f}_n)_{n=-\infty}^{\infty}$ of q-variates

(1.3)  $\mathbf{f}_n = (f_n^1, \dots, f_n^q), \quad \text{where } f_n^i \in L_2(\Omega, \mathcal{B}, P),$

such that the $q \times q$ covariance matrix

(1.4)  $[E(f_m^i \overline{f_n^j})] = [\gamma_{m-n}^{(i,j)}] = \Gamma_{m-n}$

depends only on the difference m-n (2). This is the hypothesis of
weak stationarity. Now let $\mathcal{M}_0$ be the (closed, linear) subspace of
$L_2(\Omega, \mathcal{B}, P)$ spanned by the $f_n^i$ with $n \le 0$ and $1 \le i \le q$; in
symbols

(1.5)  $\mathcal{M}_0 = \mathfrak{S}(f_n^i,\ n \le 0,\ 1 \le i \le q).$


Then our problem may be stated as follows:

1.6 Prediction Problem. Assuming as known the covariance
bisequence $(\Gamma_k)_{-\infty}^{\infty}$ and given $\nu \ge 1$, find variates
$\hat f_\nu^1, \dots, \hat f_\nu^q \in \mathcal{M}_0$ such that

$E(|f_\nu^i - \hat f_\nu^i|^2) \le E(|f_\nu^i - g|^2), \quad g \in \mathcal{M}_0, \quad 1 \le i \le q.$

Also find the prediction-error covariance matrix

$G_\nu = [E\{(f_\nu^i - \hat f_\nu^i)\,\overline{(f_\nu^j - \hat f_\nu^j)}\}].$

Now $L_2(\Omega, \mathcal{B}, P)$ is a Hilbert space with the inner product
$(f, g) = E(f \bar g)$. Since our problem involves only second-order
moments, we can restate it as one for a Hilbert space $\mathcal{H}$, as
Kolmogorov first emphasized in 1940, cf. [12]. To get the usual
probabilistic version of the theory we must of course think of this $\mathcal{H}$
as being $L_2(\Omega, \mathcal{B}, P)$. But for theoretical purposes it is best to
leave $\mathcal{H}$ unspecified. Adopting this point of view, what we have
is a bisequence of vectors $(\mathbf{f}_n)_{-\infty}^{\infty}$ such that

$\mathbf{f}_n = (f_n^1, \dots, f_n^q), \quad \text{where } f_n^i \in \mathcal{H};$

i.e. each $\mathbf{f}_n$ is in the Cartesian product $\mathcal{H}^q$ of $\mathcal{H}$ with itself q
times. For q-variate prediction the structure of this hyperspace is
crucial and must first engage our attention (§2).


2. The Gram matricial structure of $\mathcal{H}^q$

Let $\mathcal{H}$ be any (complex) Hilbert space, $q \ge 1$, and $\mathcal{H}^q$ be the
Cartesian product of $\mathcal{H}$ with itself q times, i.e. the set of all vectors
$\mathbf{f} = (f^1, \dots, f^q)$ such that each $f^i \in \mathcal{H}$. To make $\mathcal{H}^q$ serviceable in
prediction theory we must endow it with a Gram matricial structure, as Doob
[4, p.594] noticed. For $\mathbf{f}, \mathbf{g} \in \mathcal{H}^q$, the $q \times q$ matrix

(2.1)  $(\mathbf{f}, \mathbf{g}) = [(f^i, g^j)]$

is called the Gramian of the ordered pair $\mathbf{f}, \mathbf{g}$ (3). It is reasonable to
think of it as a matricial inner-product in view of its properties (4):

(2.2)  $(\mathbf{f}, \mathbf{f}) \ge 0; \qquad (\mathbf{f}, \mathbf{f}) = 0 \implies \mathbf{f} = 0;$

(2.3)  $\Big(\sum_{j \in J} A_j \mathbf{f}_j,\ \sum_{k \in K} B_k \mathbf{g}_k\Big) = \sum_{j \in J} \sum_{k \in K} A_j (\mathbf{f}_j, \mathbf{g}_k) B_k^*,$

where J, K are finite sets and $A_j$, $B_k$ are $q \times q$ matrices.
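The bilinearity rule (2.3) can be checked numerically in a finite-dimensional model. The following is only a sketch under stated assumptions: the Hilbert space is taken to be C^3 with its usual inner product, and the tuples `f`, `g` and the matrices `A`, `B` are arbitrary illustrative data, not taken from the paper. The helper names (`inner`, `gramian`, `act`) are hypothetical.

```python
def inner(u, v):
    """Inner product (u, v) = sum_t u[t] * conj(v[t]) on C^N."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))


def gramian(f, g):
    """Gramian (f, g) = [(f^i, g^j)] of two q-tuples of vectors in C^N."""
    return [[inner(fi, gj) for gj in g] for fi in f]


def act(A, f):
    """Left action of a q x q matrix on a q-tuple: (A f)^i = sum_j A[i][j] f^j."""
    return [[sum(A[i][j] * f[j][t] for j in range(len(f)))
             for t in range(len(f[0]))] for i in range(len(A))]


def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def adjoint(A):
    """Conjugate transpose A*."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


# Two 2-tuples of vectors in C^3 and two 2 x 2 coefficient matrices (arbitrary).
f = [[1, 2j, 0], [1, 1, 1]]
g = [[0, 1, 1j], [2, 0, 1]]
A = [[1, 2], [0, 1j]]
B = [[1j, 0], [1, 1]]

lhs = gramian(act(A, f), act(B, g))                  # (A f, B g)
rhs = matmul(matmul(A, gramian(f, g)), adjoint(B))   # A (f, g) B*
norm_sq = trace(gramian(f, f)).real                  # trace of the Gramian (f, f)
```

Here `lhs` and `rhs` agree entry by entry, which is the single-term case of (2.3), and `norm_sq` is the sum of the squared norms of the components of `f`.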
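This suggests defining orthogonality in $\mathcal{H}^q$ by the relation:

(2.4)  $\mathbf{f} \perp \mathbf{g} \iff (\mathbf{f}, \mathbf{g}) = 0 \quad (\text{i.e.} \iff f^i \perp g^j,\ 1 \le i, j \le q).$

It also suggests taking linear combinations of $\mathbf{f}_k \in \mathcal{H}^q$ with $q \times q$
matrix rather than complex coefficients, and calling a subset $\boldsymbol{\mathcal{M}}$ of $\mathcal{H}^q$
a linear manifold, if and only if

$\mathbf{f}, \mathbf{g} \in \boldsymbol{\mathcal{M}} \implies$ for all $q \times q$ matrices $A, B$: $A\mathbf{f} + B\mathbf{g} \in \boldsymbol{\mathcal{M}}.$

The appropriate topology for $\mathcal{H}^q$ turns out, however, to be the
familiar one induced by the (scalar) inner product in $\mathcal{H}^q$:

(2.5)  $((\mathbf{f}, \mathbf{g})) = \operatorname{trace}\,(\mathbf{f}, \mathbf{g}) = \sum_{i=1}^{q} (f^i, g^i),$

or rather by the corresponding norm

(2.6)  $|\mathbf{f}| = \sqrt{((\mathbf{f}, \mathbf{f}))} = \sqrt{\sum_{i=1}^{q} |f^i|^2}.$

It is well known that $\mathcal{H}^q$ is a Hilbert space under the inner product (2.5).
We call $\boldsymbol{\mathcal{M}}$ a subspace of $\mathcal{H}^q$ if and only if $\boldsymbol{\mathcal{M}}$ is both a linear manifold
and a closed set. It is easy to check, cf. [36, I, 5.8], that

(2.7)  $\boldsymbol{\mathcal{M}}$ is a subspace of $\mathcal{H}^q$ $\iff$ $\boldsymbol{\mathcal{M}} = \mathcal{M}^q$, where $\mathcal{M}$ is a subspace of $\mathcal{H}$.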

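With these concepts of orthogonality, distance and subspace we can extend
to $\mathcal{H}^q$ the well-known theory of orthogonal projections for Hilbert
spaces. Thus we have, cf. [36, I, 5.8; II, 1.17],

2.8 Lemma. If $\mathbf{f} \in \mathcal{H}^q$ and $\boldsymbol{\mathcal{M}}$ is a subspace of $\mathcal{H}^q$, then there
exists a unique $\hat{\mathbf{f}} \in \mathcal{H}^q$ satisfying any one (and therefore both) of the
following equivalent conditions:

(1)  $\hat{\mathbf{f}} \in \boldsymbol{\mathcal{M}}$ & $\mathbf{f} - \hat{\mathbf{f}} \perp \boldsymbol{\mathcal{M}}$;

(2)  $\hat{\mathbf{f}} \in \boldsymbol{\mathcal{M}}$ & $(\mathbf{f} - \hat{\mathbf{f}},\ \mathbf{f} - \hat{\mathbf{f}}) \le (\mathbf{f} - \mathbf{g},\ \mathbf{f} - \mathbf{g})$, $\mathbf{g} \in \boldsymbol{\mathcal{M}}$.

Let $\boldsymbol{\mathcal{M}} = \mathcal{M}^q$ (cf. 2.7). Then the i-th component $\hat f^i$ of $\hat{\mathbf{f}}$ is the (ordinary)
orthogonal projection of the i-th component $f^i$ of $\mathbf{f}$ on $\mathcal{M}$.

2.9 Def. The $\hat{\mathbf{f}}$ mentioned in 2.8 is called the orthogonal projection
of $\mathbf{f}$ on $\boldsymbol{\mathcal{M}}$ and written $(\mathbf{f} \mid \boldsymbol{\mathcal{M}})$.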

We thus obtain a structure for $\mathcal{H}^q$ which differs from but also closely
resembles that of a Hilbert space, and which we shall therefore call
"Hilbertian". In terms of this structure we can give a definition of a q-ple,
weakly stationary SP, in which all side issues are purged and the essential
idea brought to the forefront, cf. [37, §5]:

2.10 Def. A q-ple, weakly stationary SP is a bisequence $(\mathbf{f}_n)_{-\infty}^{\infty}$
such that each $\mathbf{f}_n \in \mathcal{H}^q$ and the Gram matrix

$(\mathbf{f}_m, \mathbf{f}_n) = \Gamma_{m-n}$

depends only on m-n. $(\Gamma_k)_{-\infty}^{\infty}$ is called the covariance bisequence
of the SP.

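Associated with a q-ple weakly stationary SP $(\mathbf{f}_n)_{-\infty}^{\infty}$ are the present
and past subspaces $\boldsymbol{\mathcal{M}}_n$, $\mathcal{M}_n$ (5):

(2.11)  $\boldsymbol{\mathcal{M}}_n = \mathfrak{S}(\mathbf{f}_k,\ k \le n) \subseteq \mathcal{H}^q, \qquad \mathcal{M}_n = \mathfrak{S}(f_k^i,\ k \le n,\ 1 \le i \le q) \subseteq \mathcal{H},$

and the terminal subspaces

(2.12)  $\boldsymbol{\mathcal{M}}_\infty = \mathfrak{S}(\mathbf{f}_k,\ \text{all } k), \qquad \mathcal{M}_\infty = \mathfrak{S}(f_k^i,\ \text{all } k,\ 1 \le i \le q),$
        $\boldsymbol{\mathcal{M}}_{-\infty} = \bigcap_{n=-\infty}^{\infty} \boldsymbol{\mathcal{M}}_n, \qquad \mathcal{M}_{-\infty} = \bigcap_{n=-\infty}^{\infty} \mathcal{M}_n.$

We easily find, cf. [36, I, 6.5], that

(2.13)  $\boldsymbol{\mathcal{M}}_n = \mathcal{M}_n^q, \quad -\infty \le n \le \infty,$

and obviously

(2.14)  $\boldsymbol{\mathcal{M}}_{-\infty} \subseteq \boldsymbol{\mathcal{M}}_n \subseteq \boldsymbol{\mathcal{M}}_{n+1} \subseteq \boldsymbol{\mathcal{M}}_\infty.$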

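In terms of these subspaces we can easily formulate the concept of
determinism and tersely restate the Prediction Problem 1.6:

2.15 Def. We call the SP deterministic, non-deterministic, purely
non-deterministic, according as

$\boldsymbol{\mathcal{M}}_{-\infty} = \boldsymbol{\mathcal{M}}_\infty, \qquad \boldsymbol{\mathcal{M}}_{-\infty} \subsetneq \boldsymbol{\mathcal{M}}_\infty, \qquad \boldsymbol{\mathcal{M}}_{-\infty} = \{0\}.$

2.16 Prediction Problem. Let $(\mathbf{f}_n)_{-\infty}^{\infty}$ be a q-ple, weakly stationary
SP with covariance bisequence $(\Gamma_k)_{-\infty}^{\infty}$ and let $\nu \ge 1$. Find

(i) the matrices $A_k^{(n)}$, such that

$\hat{\mathbf{f}}_\nu = (\mathbf{f}_\nu \mid \boldsymbol{\mathcal{M}}_0) = \lim_{n \to \infty} \sum_{k=0}^{n} A_k^{(n)} \mathbf{f}_{-k};$

(ii) $G_\nu = (\mathbf{f}_\nu - \hat{\mathbf{f}}_\nu,\ \mathbf{f}_\nu - \hat{\mathbf{f}}_\nu).$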

$G_\nu$ is called the prediction error matrix for lag $\nu$. Following
Zasuhin [43] we call $\rho = \operatorname{rank} G_1$ the rank of the SP $(\mathbf{f}_n)_{-\infty}^{\infty}$.
Obviously

(2.17)  the SP is deterministic $\iff \rho = 0$, i.e. $G_1 = 0$.

The deterministic case is only of peripheral interest to us, cf. §1.
Of much theoretical interest, though somewhat pathological, are the
non-deterministic cases of degenerate rank $1 \le \rho < q$. The really interesting
case from a practical standpoint is that of full rank $\rho = q$, for which
$\det G_1 > 0$. We note that since $G_\nu \ge G_1$ for $\nu \ge 1$, we have

(2.18)  $\rho = q \implies \det G_\nu > 0, \quad \nu \ge 1.$


3.  Elementary  solution  of  the  Prediction  Problem 

Seemingly the easiest way to solve the Problem 2.16 is by an extension
of the method of undetermined coefficients. This has been explained in
[36, II, §2], and it will suffice to indicate only a couple of steps. We
may choose the $A_k^{(n)}$ so that

(3.1)  $\mathbf{f}_\nu - \sum_{k=0}^{n} A_k^{(n)} \mathbf{f}_{-k} \perp \mathbf{f}_0, \mathbf{f}_{-1}, \dots, \mathbf{f}_{-n},$

whence, in block matrix notation,

(3.2)  $[A_0^{(n)}, \dots, A_n^{(n)}] \begin{bmatrix} \Gamma_0 & \cdots & \Gamma_n \\ \vdots & & \vdots \\ \Gamma_{-n} & \cdots & \Gamma_0 \end{bmatrix} = [\Gamma_\nu, \dots, \Gamma_{\nu+n}].$

It may be shown that the second $(n+1)q \times (n+1)q$ matrix on the left is
invertible if and only if $\det G_1 \ne 0$. Thus in the full rank case $\rho = q$, the
coefficients $A_k^{(n)}$ can be uniquely determined.
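To make the linear system (3.2) concrete, here is a minimal sketch in pure Python. Everything specific in it is an assumption made for illustration: the process is a real bivariate first-order autoregression f_n = phi f_{n-1} + e_n with the hypothetical matrices `phi` and `sigma`, its covariances are obtained from a truncated series, and the block Toeplitz matrix is read with block (k, j) equal to Gamma_{j-k}, which is what the orthogonality condition (3.1) gives under the convention (f_m, f_n) = Gamma_{m-n}. The function names are not from the paper.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def solve(a, b):
    """Solve a x = b (a square, b with several columns) by Gauss-Jordan
    elimination with partial pivoting."""
    n = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                factor = m[r][c]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]


def var1_covariances(phi, sigma, max_lag, terms=200):
    """Gamma_k = E f_{n+k} f_n^T for the (hypothetical) model
    f_n = phi f_{n-1} + e_n with E e_n e_n^T = sigma."""
    q = len(phi)
    gamma0 = [[0.0] * q for _ in range(q)]
    power = [[float(i == j) for j in range(q)] for i in range(q)]
    for _ in range(terms):  # Gamma_0 = sum_j phi^j sigma (phi^j)^T, truncated
        term = matmul(matmul(power, sigma), transpose(power))
        gamma0 = [[gamma0[i][j] + term[i][j] for j in range(q)] for i in range(q)]
        power = matmul(phi, power)
    gammas = {0: gamma0}
    for k in range(1, max_lag + 1):
        gammas[k] = matmul(phi, gammas[k - 1])   # Gamma_k = phi^k Gamma_0
        gammas[-k] = transpose(gammas[k])        # real case: Gamma_{-k} = Gamma_k^T
    return gammas


def predictor(gammas, q, n, nu):
    """Solve (3.2) for [A_0, ..., A_n]; also return the error Gramian."""
    size = (n + 1) * q
    # Block (k, j) of the Toeplitz matrix is Gamma_{j-k}.
    big = [[gammas[j // q - i // q][i % q][j % q] for j in range(size)]
           for i in range(size)]
    rhs = [[gammas[nu + j // q][i][j % q] for j in range(size)] for i in range(q)]
    x = transpose(solve(transpose(big), transpose(rhs)))   # x @ big = rhs
    coeffs = [[row[k * q:(k + 1) * q] for row in x] for k in range(n + 1)]
    err = [row[:] for row in gammas[0]]
    for k, a_k in enumerate(coeffs):   # Gamma_0 - sum_k A_k Gamma_{-(nu+k)}
        prod = matmul(a_k, gammas[-(nu + k)])
        err = [[err[i][j] - prod[i][j] for j in range(q)] for i in range(q)]
    return coeffs, err


phi = [[0.5, 0.2], [0.0, 0.3]]
sigma = [[1.0, 0.0], [0.0, 1.0]]
gammas = var1_covariances(phi, sigma, max_lag=2)
coeffs, err = predictor(gammas, q=2, n=1, nu=1)
```

For this model the one-step predictor is phi f_0, so `coeffs[0]` reproduces `phi`, `coeffs[1]` vanishes up to rounding, and `err` agrees with `sigma`. Increasing `n` enlarges the system without changing the answer here, which illustrates why the method is workable for Markovian processes but wasteful in general.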

This method involves solving a system of linear equations. It would
be feasible for so-called weakly (or wide sense) N-Markovian processes,
i.e., cf. [4, pp. 90, 506], weakly stationary ones for which

(3.3)  $\hat{\mathbf{f}}_\nu = (\mathbf{f}_\nu \mid \mathfrak{S}(\mathbf{f}_{-k},\ k \ge 0)) = (\mathbf{f}_\nu \mid \mathfrak{S}(\mathbf{f}_{-k},\ 0 \le k \le N)), \quad \nu \ge 1,$

and where, consequently, for a given $\nu \ge 1$ there is a fixed set of N + 1
matrices $A_0, \dots, A_N$ such that

$\hat{\mathbf{f}}_\nu = A_0 \mathbf{f}_0 + A_1 \mathbf{f}_{-1} + \dots + A_N \mathbf{f}_{-N}.$

One might even be able to shorten the computation by adapting for q > 1
the interesting devices suggested by Levinson [15, §3] for q = 1. But
for other types of processes, as time flows and our data accumulate, we
would like to let the n in (3.1) increase, and thereby utilize our additional
data. This would mean solving larger and larger linear systems de novo, a
procedure of questionable efficiency.

It was Wiener's belief that an efficacious computational procedure
would emerge from a deeper analysis of the problem. We now turn to such
analysis.
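4. The shift operator and Wold-Zasuhin decomposition


Let $(\mathbf{f}_n)_{-\infty}^{\infty}$ be a q-ple, weakly stationary SP. Then as Kolmogorov
[12] showed, there is a unique unitary operator U on $\mathcal{M}_\infty \subseteq \mathcal{H}$ onto $\mathcal{M}_\infty$
such that

(4.1)  $U(f_n^i) = f_{n+1}^i, \quad -\infty < n < \infty, \quad 1 \le i \le q.$ (6)

U, or rather its inflation $\mathbf{U}$, defined by

(4.2)  $\mathbf{U}(\mathbf{f}) = (Uf^1, \dots, Uf^q), \quad \mathbf{f} = (f^i)_{i=1}^{q} \in \mathcal{H}^q,$

is called the shift operator of the SP. Obviously $\mathbf{U}$ is an operator on
$\boldsymbol{\mathcal{M}}_\infty$ onto $\boldsymbol{\mathcal{M}}_\infty$ such that

(4.3)  $\mathbf{U}(\mathbf{f}_n) = \mathbf{f}_{n+1}, \quad -\infty < n < \infty.$

Now

$\mathbf{U}^*(\boldsymbol{\mathcal{M}}_n) = \boldsymbol{\mathcal{M}}_{n-1} \subseteq \boldsymbol{\mathcal{M}}_n.$

Hence (7)

(4.4)  $\mathbf{V} = \text{Rstr.}_{\boldsymbol{\mathcal{M}}_0}\, \mathbf{U}^*$ = an isometry on $\boldsymbol{\mathcal{M}}_0$ onto $\boldsymbol{\mathcal{M}}_{-1}$.

The theory of this isometry $\mathbf{V}$ subsumes the time-domain analysis of
our SP, as we shall now indicate.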

Since the appearance of von Neumann and Murray's work on operators
it has been known in some implicit form that if V is an isometry on a
Hilbert space $\mathcal{H}$ onto $R \subseteq \mathcal{H}$, then

$\mathcal{H} = \bigcap_{k=0}^{\infty} V^k(\mathcal{H}) + \sum_{k=0}^{\infty} V^k(R^\perp), \qquad V^j(R^\perp) \perp V^k(R^\perp), \quad j \ne k,$

where the two subspaces on the right side of the equality are themselves
orthogonal. But the great importance of this result has emerged only recently,
cf. Halmos [6]. As indicated in [22, 2.8] it extends to $\mathcal{H}^q$: if $\mathbf{V}$ is the
inflation to $\mathcal{H}^q$ of an isometry on $\mathcal{H}$ and $\mathbf{R} \subseteq \mathcal{H}^q$ is the range of $\mathbf{V}$,
then

(4.5)  $\mathcal{H}^q = \bigcap_{k=0}^{\infty} \mathbf{V}^k(\mathcal{H}^q) + \sum_{k=0}^{\infty} \mathbf{V}^k(\mathbf{R}^\perp), \qquad \mathbf{V}^j(\mathbf{R}^\perp) \perp \mathbf{V}^k(\mathbf{R}^\perp), \quad j \ne k,$

where the subspaces on the right side of the equality are again orthogonal,
but in the sense of (2.4).

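Turning to our SP, and applying (4.5) with $\boldsymbol{\mathcal{M}}_0$ and $\mathbf{V}$ as in
(4.4), we get at once

$\boldsymbol{\mathcal{M}}_0 = \boldsymbol{\mathcal{M}}_{-\infty} + \sum_{k=0}^{\infty} \mathbf{U}^{-k}(\boldsymbol{\mathcal{M}}_{-1}^\perp \cap \boldsymbol{\mathcal{M}}_0).$

Now, and this is crucial, we can show that for a non-deterministic SP

$\boldsymbol{\mathcal{M}}_{-1}^\perp \cap \boldsymbol{\mathcal{M}}_0 = \mathfrak{S}(\mathbf{g}_0), \quad \text{where } \mathbf{g}_0 = \mathbf{f}_0 - (\mathbf{f}_0 \mid \boldsymbol{\mathcal{M}}_{-1}),$

which means roughly that $\mathbf{R}^\perp = \boldsymbol{\mathcal{M}}_{-1}^\perp \cap \boldsymbol{\mathcal{M}}_0$ is "one-dimensional". Letting
$\mathbf{g}_k = \mathbf{U}^k \mathbf{g}_0$, we readily obtain for any n

(4.6)  $\boldsymbol{\mathcal{M}}_n = \boldsymbol{\mathcal{M}}_{-\infty} + \sum_{k=0}^{\infty} \mathfrak{S}(\mathbf{g}_{n-k}), \qquad \boldsymbol{\mathcal{M}}_\infty = \boldsymbol{\mathcal{M}}_{-\infty} + \sum_{k=-\infty}^{\infty} \mathfrak{S}(\mathbf{g}_k).$

This is the Wold-Zasuhin decomposition of the subspace $\boldsymbol{\mathcal{M}}_n$ (8).
In this decomposition the vector

(4.7)  $\mathbf{g}_n = \mathbf{U}^n \mathbf{g}_0, \quad \text{where } \mathbf{g}_0 = \mathbf{f}_0 - (\mathbf{f}_0 \mid \boldsymbol{\mathcal{M}}_{-1}),$

is called the n-th innovation vector of our SP, and $(\mathbf{g}_n)_{-\infty}^{\infty}$ is called
its innovation SP. Obviously

(4.8)  $(\mathbf{g}_m, \mathbf{g}_n) = \delta_{mn} G, \quad \text{where } G = (\mathbf{g}_0, \mathbf{g}_0),$

and since $\mathbf{U}\mathbf{g}_0 = \mathbf{f}_1 - (\mathbf{f}_1 \mid \boldsymbol{\mathcal{M}}_0) = \mathbf{f}_1 - \hat{\mathbf{f}}_1$ (cf. 2.16), we see that

(4.9)  $G = G_1$ = the prediction error matrix for lag 1.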

It is convenient to "normalize" the innovation vectors. For this we
think of the matrix G as a linear operator on $C^q$ to $C^q$, C being the
complex number field. Let the matrix J represent the projection on
$C^q$ onto the range of G, and let $J^\perp = I - J$. It is easy to show that
there is a unique $q \times q$ matrix H such that

(4.10)  $H J^\perp = J^\perp = J^\perp H, \qquad H\sqrt{G} = J = \sqrt{G}\,H.$

Indeed,

(4.11)  $H = (\sqrt{G} + J^\perp)^{-1},$

which shows that

(4.12)  H is invertible, hermitian and positive-definite.

Now let $\mathbf{h}_n = H\mathbf{g}_n$. Then we find, cf. [22, (3.4) et seq.],

(4.13)  $\mathbf{g}_n = J\mathbf{g}_n = \sqrt{G}\,\mathbf{h}_n, \qquad (\mathbf{h}_m, \mathbf{h}_n) = \delta_{mn} J, \qquad J^\perp \mathbf{h}_n = 0.$

We call $\mathbf{h}_n$ the n-th normalized innovation vector of our SP, and
$(\mathbf{h}_n)_{-\infty}^{\infty}$ its normalized innovation SP. In the full rank case $\rho = q$, we
have $\det G \ne 0$, and so we can define the $\mathbf{h}_n$ by the simple equation
$\mathbf{h}_n = (\sqrt{G})^{-1} \mathbf{g}_n$. Since in this case $J = I$, we have $(\mathbf{h}_m, \mathbf{h}_n) = \delta_{mn} I$.
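A small worked case may help fix the normalization in the degenerate-rank situation. The numbers are chosen purely for illustration and are not from the paper. Take q = 2 and

$G = \begin{bmatrix} 4 & 0 \\ 0 & 0 \end{bmatrix}, \qquad \sqrt{G} = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}, \qquad J = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad J^\perp = I - J = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.$

Then (4.11) gives

$H = (\sqrt{G} + J^\perp)^{-1} = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1/2 & 0 \\ 0 & 1 \end{bmatrix},$

and one checks (4.10) directly: $H J^\perp = J^\perp = J^\perp H$ and $H\sqrt{G} = J = \sqrt{G}\,H$. With $\mathbf{h}_n = H\mathbf{g}_n$ we get

$(\mathbf{h}_n, \mathbf{h}_n) = H\,(\mathbf{g}_n, \mathbf{g}_n)\,H^* = H G H = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = J,$

in agreement with (4.13): the second component of the innovation has zero norm, and the normalization rescales only the first.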

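As shown in [22, 3.2, 3.5] the decomposition (4.6) of the subspace
$\boldsymbol{\mathcal{M}}_n$ yields a decomposition of the process $(\mathbf{f}_n)_{-\infty}^{\infty}$ itself:

(4.14)  $\mathbf{f}_n = \mathbf{u}_n + \mathbf{v}_n, \qquad \mathbf{u}_m \perp \mathbf{v}_n, \qquad -\infty < m, n < \infty,$

where $(\mathbf{u}_n)_{-\infty}^{\infty}$ is a (purely non-deterministic) one-sided moving average
of the normalized innovations and has the same rank as $(\mathbf{f}_n)_{-\infty}^{\infty}$, and
$(\mathbf{v}_n)_{-\infty}^{\infty}$ is purely deterministic. More fully,

(4.15)  $\mathbf{u}_n = \sum_{k=0}^{\infty} A_k \sqrt{G}\,\mathbf{h}_{n-k}, \qquad \sum_{k=0}^{\infty} |A_k \sqrt{G}|^2 < \infty,$ (9)

where

(4.16)  $A_k\sqrt{G} = (\mathbf{f}_0, \mathbf{h}_{-k}), \qquad A_0\sqrt{G} = \sqrt{G}, \qquad A_0 \mathbf{g}_0 = \mathbf{g}_0, \qquad A_k G = (\mathbf{f}_0, \mathbf{g}_{-k})$

are unique, although the $A_k$ are not unique. Also

(4.17)  $\mathbf{v}_n = (\mathbf{f}_n \mid \boldsymbol{\mathcal{M}}_{-\infty}).$

The relations (4.14)-(4.17) constitute an alternative form of the
Wold-Zasuhin decomposition.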
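As an illustration of how the moving-average form is used, the prediction error matrix for lag $\nu$ can be read off from (4.14)-(4.17). This is a standard consequence, sketched here rather than quoted from the text. Since $\mathbf{v}_\nu$ and the terms of (4.15) with $\nu - k \le 0$ lie in $\boldsymbol{\mathcal{M}}_0$, while the innovations $\mathbf{h}_j$ with $j \ge 1$ are orthogonal to $\boldsymbol{\mathcal{M}}_0$, we have

$\mathbf{f}_\nu - \hat{\mathbf{f}}_\nu = \sum_{k=0}^{\nu-1} A_k \sqrt{G}\,\mathbf{h}_{\nu-k},$

and therefore, using $(\mathbf{h}_m, \mathbf{h}_n) = \delta_{mn} J$ and $\sqrt{G}\,J\,\sqrt{G} = G$,

$G_\nu = \sum_{k=0}^{\nu-1} A_k \sqrt{G}\,J\,\sqrt{G}\,A_k^* = \sum_{k=0}^{\nu-1} A_k G A_k^*.$

For $\nu = 1$ the single term is $(A_0\sqrt{G})(A_0\sqrt{G})^* = G$, which recovers (4.9). Each further term is non-negative hermitian, so $G_\nu \ge G_1$, the inequality used for (2.18).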

It is well known that the conditions (4.14) and

(4.18)  $(\mathbf{u}_n)_{-\infty}^{\infty}$ is purely non-deterministic, $(\mathbf{v}_n)_{-\infty}^{\infty}$ is deterministic

do not together characterize the Wold-Zasuhin decomposition. An extra
condition is needed, which is usually stated (with an obvious notation)
in the form

(4.19)  $\boldsymbol{\mathcal{M}}_n^{(u)} \subseteq \boldsymbol{\mathcal{M}}_n^{(f)}$ for some integer n (10).

Does (4.19) work with $n = \infty$? (11). Recently Robertson [26, App. B]
has shown that the answer is in the negative for q > 1, but that the
stronger condition

(4.20)  $\boldsymbol{\mathcal{M}}_\infty^{(u)} \subseteq \boldsymbol{\mathcal{M}}_\infty^{(f)}$ & $\operatorname{rank}\,(\mathbf{u}_n)_{-\infty}^{\infty} = \operatorname{rank}\,(\mathbf{f}_n)_{-\infty}^{\infty}$

does work for any q; i.e. (4.14), (4.18) and (4.20) together characterize
the Wold-Zasuhin decomposition. A recent result of Robertson [27] on
the wandering subspaces of unitary operators yields a nice, spectral-free
proof.

5. Spectral analysis

The shift operator U of our q-ple weakly stationary SP $(\mathbf{f}_n)_{-\infty}^{\infty}$
has a spectral resolution:

(5.1)  $U = \int_0^{2\pi} e^{-i\theta}\, E(d\theta),$

where E is a projection-valued measure over $([0, 2\pi], \mathcal{B})$, $\mathcal{B}$ being the
family of Borel subsets of $[0, 2\pi]$. By taking the inflation $\mathbf{E}$ of E we
associate two new measures with our SP:

(i) an $\mathcal{H}^q$-valued countably-additive, orthogonally-scattered (c.a.o.s.)
measure $\boldsymbol{\xi}$ defined by

(5.2)  $\boldsymbol{\xi}(B) = \mathbf{E}(B)\mathbf{f}_0, \quad B \in \mathcal{B};$

so-called, because of its decisive property

$B, C \in \mathcal{B}$ & B, C disjoint $\implies \boldsymbol{\xi}(B) \perp \boldsymbol{\xi}(C);$ (12)

(ii) a $q \times q$ non-negative, hermitian matrix-valued measure M
defined by

(5.3)  $M(B) = (\mathbf{E}(B)\mathbf{f}_0,\ \mathbf{E}(B)\mathbf{f}_0) = (\boldsymbol{\xi}(B), \boldsymbol{\xi}(B)), \quad B \in \mathcal{B}.$

We then introduce the well known $q \times q$ spectral distribution F of our
SP by the definition

(5.4)  $F(\theta) = 2\pi\, M(0, \theta], \quad 0 \le \theta \le 2\pi.$

Likewise one could define the q-ple process of orthogonal increments
$(\boldsymbol{\eta}_\theta,\ \theta \in (0, 2\pi])$ associated with our process by

$\boldsymbol{\eta}_\theta = 2\pi\, \boldsymbol{\xi}(0, \theta], \quad 0 < \theta \le 2\pi.$

Next, for a complex-valued function $\phi$ on $[0, 2\pi]$ we define the
integrals

$\int_0^{2\pi} \phi(\theta)\, \boldsymbol{\xi}(d\theta), \qquad \int_0^{2\pi} \phi(\theta)\, M(d\theta), \qquad \int_0^{2\pi} \phi(\theta)\, dF(\theta)$

to be

$\Big(\int_0^{2\pi} \phi(\theta)\, \xi^i(d\theta)\Big)_{i=1}^{q}, \qquad \Big[\int_0^{2\pi} \phi(\theta)\, M_{ij}(d\theta)\Big], \qquad \Big[\int_0^{2\pi} \phi(\theta)\, dF_{ij}(\theta)\Big].$

These definitions make sense, since the components $\xi^i$ of $\boldsymbol{\xi}$ are $\mathcal{H}$-valued
c.a.o.s. measures for which a theory of integration akin to that given in
Doob's book [4, Ch. IX, §2] is available, and the entries $M_{ij}$, $F_{ij}$ of M, F are
complex-valued measures, and complex-valued functions of bounded
variation. With these definitions we easily get the spectral representation
of our SP and of its covariance (cf. [36, I, 7.1]):

(5.5)  $\mathbf{f}_n = \int_0^{2\pi} e^{-ni\theta}\, \mathbf{E}(d\theta)\mathbf{f}_0 = \int_0^{2\pi} e^{-ni\theta}\, \boldsymbol{\xi}(d\theta),$

(5.6)  $\Gamma_n = \int_0^{2\pi} e^{-ni\theta}\, M(d\theta) = \frac{1}{2\pi} \int_0^{2\pi} e^{-ni\theta}\, dF(\theta).$


Finally, we define matricial Riemann-Stieltjes integrals of the form

$\int_0^{2\pi} \Phi(\theta)\, dF(\theta)\, \Psi(\theta),$ where $\Phi$, $\Psi$ are continuous matrix-valued functions,

and F a matrix-valued function of bounded variation, by adopting the
classical pattern, cf. [36, I, §4]. From (5.6) we then get

(5.7)  $\Big(\sum_{j \in J} A_j \mathbf{f}_j,\ \sum_{k \in K} B_k \mathbf{f}_k\Big) = \frac{1}{2\pi} \int_0^{2\pi} \Big(\sum_{j \in J} A_j e^{-ji\theta}\Big)\, dF(\theta)\, \Big(\sum_{k \in K} B_k e^{-ki\theta}\Big)^*,$

where J, K are finite sets of integers and $A_j$, $B_k$ are $q \times q$ matrices,
cf. [36, I, 7.9(a)].
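The covariance formula (5.6) can be tried out numerically on a simple example. The sketch below rests on assumptions that are not in the text: the process is taken to be a bivariate first-order moving average f_n = h_n + C h_{n-1} of orthonormal innovations, with an arbitrary hypothetical matrix `c`, so that under the sign convention of (5.5) its spectral density is F'(theta) = Phi(theta) Phi(theta)^* with Phi(theta) = I + C e^{i theta}. The integral is replaced by an equally spaced Riemann sum, which is exact for trigonometric polynomials of this low degree.

```python
import cmath
import math


def spectral_density(c, theta):
    """F'(theta) = Phi(theta) Phi(theta)^* with Phi(theta) = I + C e^{i theta}."""
    q = len(c)
    z = cmath.exp(1j * theta)
    phi = [[(1.0 if i == j else 0.0) + c[i][j] * z for j in range(q)]
           for i in range(q)]
    return [[sum(phi[i][k] * phi[j][k].conjugate() for k in range(q))
             for j in range(q)] for i in range(q)]


def covariance_from_spectrum(c, n, points=64):
    """Gamma_n = (1 / 2 pi) int_0^{2 pi} e^{-n i theta} F'(theta) d theta,
    approximated by an equally spaced Riemann sum."""
    q = len(c)
    total = [[0j] * q for _ in range(q)]
    for m in range(points):
        theta = 2.0 * math.pi * m / points
        w = cmath.exp(-1j * n * theta) / points
        dens = spectral_density(c, theta)
        for i in range(q):
            for j in range(q):
                total[i][j] += w * dens[i][j]
    return total


c = [[0.5, 1.0], [0.0, -0.25]]
gamma0 = covariance_from_spectrum(c, 0)   # should equal I + C C^T
gamma1 = covariance_from_spectrum(c, 1)   # should equal C
gamma2 = covariance_from_spectrum(c, 2)   # should vanish
```

The time-domain computation gives the same values: (f_1, f_0) = C (h_0, h_0) = C and (f_0, f_0) = I + C C^*, while lags beyond 1 are uncorrelated.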

It is natural to ask if the equality (5.7) continues to hold when limits of linear
combinations and of trigonometrical polynomials are taken on the two sides.
This raises the preliminary question as to how $\int_0^{2\pi} \Phi(\theta)\, dF(\theta)\, \Psi^*(\theta)$ or equivalently
$\int_0^{2\pi} \Phi(\theta)\, M(d\theta)\, \Psi^*(\theta)$ is to be defined when $\Phi$, $\Psi$ are any (discontinuous)
matrix-valued functions on $[0, 2\pi]$ with Borel measurable entries. We can pose this
question for any non-negative, hermitian matrix-valued measure M, not just
the one defined in (5.3). The further development of the spectral theory
of q-ple processes hinges on the answer (§6).

6. The space $L_{2,M}$ for a non-negative hermitian matrix-valued measure.

Let M be any $q \times q$ non-negative, hermitian matrix-valued measure
over $([0, 2\pi], \mathcal{B})$ and suppose that we have in some way defined the
integrals

(1)  $\int_0^{2\pi} \Phi(\theta)\, M(d\theta)\, \Psi^*(\theta)$

for $q \times q$ matrix-valued functions $\Phi$, $\Psi$ with Borel measurable entries.
It would then be natural to define the $L_2$ class with respect to the
measure M by

(6.1)  $L_{2,M} = L_2([0, 2\pi], \mathcal{B}, M) = \Big\{\Phi : \int_0^{2\pi} \Phi(\theta)\, M(d\theta)\, \Phi^*(\theta) \text{ exists}\Big\}.$

Now one of the fundamental properties of the class $L_{2,M}$, when q = 1,
i.e. when $\Phi$, M are complex-valued, is its completeness under the
norm

$|\Phi|_M = \sqrt{\int_0^{2\pi} |\Phi(\theta)|^2\, M(d\theta)}.$

This is the core of the celebrated Riesz-Fischer Theorem. For q > 1,
the corresponding norm would appear to be

(6.2)  $|\Phi|_M = \sqrt{\operatorname{trace} \int_0^{2\pi} \Phi(\theta)\, M(d\theta)\, \Phi^*(\theta)}.$
Our definition of the integral (1) would be useless, were the space $L_{2,M}$
defined in (6.1) to be incomplete under the norm (6.2). We are thus faced with
the following problem:

Problem. Define the integrals (1) in such a way that the space
$L_{2,M}$ defined in (6.1) is complete under the norm (6.2).

This problem was settled by M. Rosenberg [32, §3] for rectangular
matrices $\Phi$ and by Yu. A. Rosanov [31, Ch. I, §7] for vectorial $\Phi$,
independently around 1963. We shall follow Rosenberg's more inclusive
treatment. He observed that a $q \times q$ non-negative, hermitian matrix-valued
measure M is invariably absolutely continuous with respect to the
non-negative, real measure trace M. Writing $\tau M$ for trace M, it follows
that each entry of M has a Radon-Nikodym derivative with respect to $\tau M$.
The $q \times q$ matrix $dM/d\tau M$ of these derivatives has nice properties, and
this suggests adoption of the definition

(6.3)  $\int_0^{2\pi} \Phi(\theta)\, M(d\theta)\, \Psi^*(\theta) = \int_0^{2\pi} \Phi(\theta)\, \frac{dM}{d\tau M}(\theta)\, \Psi^*(\theta)\, \tau M(d\theta),$

the last integral being defined (earlier) as the matrix of Lebesgue integrals
of the entries of $\Phi\,(dM/d\tau M)\,\Psi^*$ with respect to the ordinary measure $\tau M$.
Rosenberg showed that this definition solves the problem. Thus

6.4 Thm. (Rosenberg-Rosanov) With the definitions 6.3 and 6.1
the space $L_{2,M}$ is complete under the norm (6.2). (13)


In case the measure M is absolutely continuous with respect to
Lebesgue measure L, it follows at once from simple properties of
Radon-Nikodym derivatives that (6.3) is equivalent to the simpler definition

(6.5)  $\int_0^{2\pi} \Phi(\theta)\, M(d\theta)\, \Psi^*(\theta) = \frac{1}{2\pi} \int_0^{2\pi} \Phi(\theta)\, F'(\theta)\, \Psi^*(\theta)\, d\theta,$

where F is as in (5.4). The work of Rosenberg and Rosanov thus subsumes
the partial results obtained previously on the basis of (6.5), e.g.
those in [36, II, §4].
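A simple case in which (6.5) is unavailable but (6.3) still applies is a measure with a single atom. The example is ours, chosen for illustration. Let q = 2, fix $\theta_0 \in [0, 2\pi]$, and let

$M(B) = P\,\delta_{\theta_0}(B), \qquad P = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix},$

where $\delta_{\theta_0}$ is the unit mass at $\theta_0$. Then M is non-negative hermitian but not absolutely continuous with respect to Lebesgue measure. Here

$\tau M = 2\,\delta_{\theta_0}, \qquad \frac{dM}{d\tau M}(\theta_0) = \tfrac{1}{2} P,$

so that (6.3) gives

$\int_0^{2\pi} \Phi(\theta)\, M(d\theta)\, \Psi^*(\theta) = \Phi(\theta_0)\,\tfrac{1}{2}P\,\Psi^*(\theta_0) \cdot 2 = \Phi(\theta_0)\, P\, \Psi^*(\theta_0).$

In particular $|\Phi|_M^2 = \operatorname{trace}\,\Phi(\theta_0) P \Phi^*(\theta_0)$, which vanishes whenever $\Phi(\theta_0)$ annihilates the vector $(1, 1)^T$. Two functions can thus agree as elements of $L_{2,M}$ without being equal at $\theta_0$, which is why the rank-deficient case requires care.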

Having defined the integrals (1), we can introduce in $L_{2,M}$
matrix- and complex-valued inner products by the definitions:

(6.6)  $(\Phi, \Psi)_M = \int_0^{2\pi} \Phi(\theta)\, M(d\theta)\, \Psi^*(\theta), \quad \Phi, \Psi \in L_{2,M},$

(6.7)  $((\Phi, \Psi))_M = \operatorname{trace}\,(\Phi, \Psi)_M.$

The norm introduced in (6.2) can then be written

(6.8)  $|\Phi|_M = \sqrt{((\Phi, \Phi))_M}.$

The equations (6.6)-(6.8) are comparable to (2.1), (2.5), (2.6).
The fact that $L_{2,M}$ is complete under the norm (6.8) thus shows that $L_{2,M}$
is Hilbertian in the sense of §2. Thus every non-negative, hermitian
matrix-valued measure M generates a Hilbertian space $L_{2,M}$.

This result holds in particular for the measure M, defined in (5.3),
which is associated with the SP $(\mathbf{f}_n)_{-\infty}^{\infty}$. Thus every q-ple, weakly
stationary SP possesses two Hilbertian spaces $\boldsymbol{\mathcal{M}}_\infty \subseteq \mathcal{H}^q$ and $L_{2,M}$.
We shall refer to them as the temporal and spectral spaces of the SP.
For q = 1 we know that they are isomorphic Hilbert spaces under a natural
correspondence, cf. [12, (2.7)]. This isomorphism survives when q > 1 (§7).

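7. Isomorphism between the temporal and spectral spaces

Let $\boldsymbol{\xi}$ be any $\mathcal{H}^q$-valued, c.a.o.s. measure over $([0, 2\pi], \mathcal{B})$, cf. §5, and let

$M_\xi(B) = (\boldsymbol{\xi}(B), \boldsymbol{\xi}(B)), \quad B \in \mathcal{B}.$

Then $M_\xi$ is clearly a $q \times q$ non-negative, hermitian matrix-valued measure.
The crucial fact that the associated space $L_{2,M_\xi}$ has a (complete) Hilbertian
structure enables us to define integrals of the type $\int_0^{2\pi} \Phi(\theta)\, \boldsymbol{\xi}(d\theta)$, where
$\Phi \in L_{2,M_\xi}$, following essentially the procedure adopted for stochastic
integration in Doob's book [4, Ch. IX, §2], and to prove the following
theorem (for details, cf. Rosenberg [32, §4]):

7.1 Thm. Let (i) $\boldsymbol{\xi}$ be any $\mathcal{H}^q$-valued, c.a.o.s. measure over
$([0, 2\pi], \mathcal{B})$, (ii) $M_\xi(B) = (\boldsymbol{\xi}(B), \boldsymbol{\xi}(B))$, $B \in \mathcal{B}$, (iii) $\mathfrak{S}_\xi = \mathfrak{S}\{\boldsymbol{\xi}(B) : B \in \mathcal{B}\} \subseteq \mathcal{H}^q$ (14). Then

(a) $\mathbf{g} \in \mathfrak{S}_\xi \iff \exists\, \Phi \in L_{2,M_\xi}$ such that $\mathbf{g} = \int_0^{2\pi} \Phi(\theta)\, \boldsymbol{\xi}(d\theta)$;

(b) the $\Phi$ in (a) is uniquely defined up to a set of zero $M_\xi$ measure;

(c) the correspondence $\Phi \to \int_0^{2\pi} \Phi(\theta)\, \boldsymbol{\xi}(d\theta)$ is an isomorphism on
$L_{2,M_\xi}$ onto the subspace $\mathfrak{S}_\xi$ of $\mathcal{H}^q$; i.e. it is one-one on $L_{2,M_\xi}$ onto
$\mathfrak{S}_\xi$ and, cf. (6.6),

$(\Phi, \Psi)_{M_\xi} = \Big(\int_0^{2\pi} \Phi(\theta)\, \boldsymbol{\xi}(d\theta),\ \int_0^{2\pi} \Psi(\theta)\, \boldsymbol{\xi}(d\theta)\Big).$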
Each $\mathcal{H}^q$-valued, c.a.o.s. measure $\boldsymbol{\xi}$ thus carries with it two isomorphic
Hilbertian spaces $\mathfrak{S}_\xi$ and $L_{2,M_\xi}$. This applies in particular to the
c.a.o.s. measure $\boldsymbol{\xi}$, defined in (5.2), which is associated with the SP $(\mathbf{f}_n)_{-\infty}^{\infty}$.
But now $\boldsymbol{\xi}(B) = \mathbf{E}(B)\mathbf{f}_0$, and so we have

(7.2)  $\mathfrak{S}_\xi = \mathfrak{S}\{\mathbf{E}(B)\mathbf{f}_0 : B \in \mathcal{B}\} = \mathfrak{S}\{\mathbf{U}^n \mathbf{f}_0 : -\infty < n < \infty\} = \boldsymbol{\mathcal{M}}_\infty.$

Here the first and third equalities are obvious, and the second stems from
the basic connection between U and E given in (5.1), as is well known
from the theory of cyclic subspaces of Hilbert spaces. We thus get as a
corollary of 7.1 the result:

7.3 Thm. For a q-ple, weakly stationary SP $(\mathbf{f}_n)_{-\infty}^{\infty}$, the
correspondence $\Phi \to \int_0^{2\pi} \Phi(\theta)\, \mathbf{E}(d\theta)\mathbf{f}_0$ is an isomorphism on the spectral space
$L_{2,M}$ onto the temporal space $\boldsymbol{\mathcal{M}}_\infty \subseteq \mathcal{H}^q$.

This theorem shows of course that the equality (5.7) holds when limits
are taken in the $\mathcal{H}^q$ and $L_{2,M}$ topologies on the two sides.

8. Cross-covariance and spectra. Subordination

To treat simultaneously two or more q-ple, weakly stationary SP's
$(\mathbf{f}_n)_{-\infty}^{\infty}$, $(\mathbf{g}_n)_{-\infty}^{\infty}$, ..., it is convenient to use subscripts or superscripts
f, g, ... to distinguish their possessions, e.g. to denote their temporal
spaces by $\boldsymbol{\mathcal{M}}_\infty^{(f)}$, $\boldsymbol{\mathcal{M}}_\infty^{(g)}$, .... The processes $(\mathbf{f}_n)_{-\infty}^{\infty}$, $(\mathbf{g}_n)_{-\infty}^{\infty}$ are said
to be stationarily cross-correlated, if and only if the Gram matrix

(8.1)  $(\mathbf{f}_m, \mathbf{g}_n) = \Gamma_{m-n}^{(f,g)}$

depends on m-n alone. The bisequence $(\Gamma_k^{(f,g)})_{-\infty}^{\infty}$ is then called
the cross-covariance of $(\mathbf{f}_n)_{-\infty}^{\infty}$ with $(\mathbf{g}_n)_{-\infty}^{\infty}$. Obviously $\Gamma_k^{(f,f)}$ is the
$\Gamma_k$ introduced in (2.10).

By a slight extension of an argument of Kolmogorov [12, Thm. 1] it follows
that if $(\mathbf{f}_n)_{-\infty}^{\infty}$, $(\mathbf{g}_n)_{-\infty}^{\infty}$ are stationarily cross-correlated, then there is a
unique unitary operator U on the subspace clos.$\{\mathcal{M}_\infty^{(f)} + \mathcal{M}_\infty^{(g)}\}$ onto itself
such that $U f_n^i = f_{n+1}^i$ and $U g_n^i = g_{n+1}^i$, $1 \le i \le q$. In dealing
simultaneously with several such processes it is therefore legitimate to start out
with a single shift operator U on $\mathcal{H}$ to $\mathcal{H}$:

(8.2)  $U = \int_0^{2\pi} e^{-i\theta}\, E(d\theta),$

and to suppose that our SP's are of the form $(\mathbf{U}^n \mathbf{f})_{-\infty}^{\infty}$, $(\mathbf{U}^n \mathbf{g})_{-\infty}^{\infty}$, ...,
where $\mathbf{f}, \mathbf{g}, \dots \in \mathcal{H}^q$, and $\mathbf{U}$ is the inflation of U.

With each ordered pair of $\mathbf{f}, \mathbf{g} \in \mathcal{H}^q$ we associate the $q \times q$ matrix-valued
cross-measure $M_{fg}$, no longer hermitian-valued, and the $q \times q$
cross-spectral distribution $F_{fg}$ defined by

(8.3)  $M_{fg}(B) = (\mathbf{E}(B)\mathbf{f},\ \mathbf{E}(B)\mathbf{g}), \quad B \in \mathcal{B},$

(8.4)  $F_{fg}(\theta) = 2\pi\, M_{fg}(0, \theta], \quad 0 \le \theta \le 2\pi.$
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Obviously 


M  f(B)  =  M,  (B)  ,  B  e  ft 
—  gf  -fg 


(8.5) 


With 


r(  f,9)  =  /  e"nieM  (dG)  =J-f  e"ni0dF  (0)  . 

-n  JQ  -fg  2tt  ^  -fg 

g  =  f,  the  M.,,  Fr,,  r  ^  we  get  are  of  course  the  M,  F,  r 
2  —  fr  -  fr  -n  n 


introduced  for  the  SP  (f  )  in  (5.3),  (5.4),  (5.6). 

We can define integrals of the form $\int_0^{2\pi} \Phi(\theta) M(d\theta) \Psi(\theta)$, where $M$
is any (non-hermitian) matrix-valued measure, by a slight extension of (6.3):

(8.6)  $\int_0^{2\pi} \Phi(\theta) M(d\theta) \Psi(\theta) = \int_0^{2\pi} \Phi(\theta) \frac{dM}{d\sigma}(\theta) \Psi(\theta)\, \sigma(d\theta)$,

where $\sigma$ is not necessarily $\tau M$ but some non-negative real measure with
respect to which $M$ is absolutely continuous. We can show that the
definition does not depend on the choice of $\sigma$. We can then prove the
following basic result by using a slightly extended version of the operational
calculus: $(^{15})$

8.7 Thm. Let (i) $f, g \in \mathcal{H}^q$, (ii) $\hat{g} = (g \mid \mathfrak{M}_\infty^{(f)})$, (iii) $\Phi \in L_{2,M_{ff}}$
be the isomorph of $\hat{g}$, i.e.

$\hat{g} = (g \mid \mathfrak{M}_\infty^{(f)}) = \int_0^{2\pi} \Phi(\theta) \mathbf{E}(d\theta) f$.

Then

(a)  $M_{gf}(B) = M_{\hat{g}f}(B) = \int_B \Phi(\theta) M_{ff}(d\theta)$, $B \in \mathcal{B}$,

(b)  $M_{\hat{g}\hat{g}}(B) = \int_B \Phi(\theta) M_{ff}(d\theta) \Phi^*(\theta) = \int_B \Phi(\theta) M_{f\hat{g}}(d\theta)$, $B \in \mathcal{B}$.

Remark. 8.7(a) suggests that $\Phi$ is in some sense the Radon-Nikodym
derivative of $M_{gf}$ with respect to $M_{ff}$. But a theory of such
derivatives for matricial measures has not been developed so far, and it
would be premature to write

$\hat{g} = (g \mid \mathfrak{M}_\infty^{(f)}) = \int_0^{2\pi} \frac{dM_{gf}}{dM_{ff}}(\theta)\, \mathbf{E}(d\theta) f$,

when $q > 1$, even though this equality prevails for $q = 1$.

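Following Kolmogorov [12, §4] we say that the SP $(\mathbf{U}^n g)_{-\infty}^{\infty}$ is subordinate
to the SP $(\mathbf{U}^n f)_{-\infty}^{\infty}$, briefly, $g$ is subordinate to $f$, if and only if
$\mathfrak{M}_\infty^{(g)} \subseteq \mathfrak{M}_\infty^{(f)}$. The last theorem then yields the following extension of
Kolmogorov's Thms. 8, 9 in [12]:

8.8 Cor. The following conditions are equivalent:

(i)  $g$ is subordinate to $f$, i.e. $\mathfrak{M}_\infty^{(g)} \subseteq \mathfrak{M}_\infty^{(f)}$;

(ii)  $\exists\, \Phi \in L_{2,M_{ff}}$ such that $g = \int_0^{2\pi} \Phi(\theta) \mathbf{E}(d\theta) f$;

(iii)  $\exists\, \Phi \in L_{2,M_{ff}}$ such that for any $B \in \mathcal{B}$

$M_{gf}(B) = \int_B \Phi(\theta) M_{ff}(d\theta)$,  $M_{gg}(B) = \int_B \Phi(\theta) M_{ff}(d\theta) \Phi^*(\theta)$.

In case $g$ is subordinate to $f$, the functions $\Phi$ in (ii) and (iii) are equal
a.e. ($M_{ff}$).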

The following is M. Rosenberg's unpublished generalization of Kolmogorov's
Thm. 10 in [12]:

8.9 Cor. Let $g$ be subordinate to $f$, and $\Phi$ be as in 8.8(ii). Then
$f$ is subordinate to $g$ (i.e. $f$, $g$ are mutually subordinate), if and only if

$\operatorname{rank} \{\Phi(\theta) \frac{dM_{ff}}{d\tau M_{ff}}(\theta) \Phi^*(\theta)\} = \operatorname{rank} \frac{dM_{ff}}{d\tau M_{ff}}(\theta)$, a.e. ($\tau M_{ff}$).

Thm. 8.7 has many applications. For instance, we can derive from it
the spectral distribution of the projection of a component of a q-ple SP
on the orthogonal complement of the space spanned by the rest of its components.
First by taking Besicovitch derivatives [1] of our matricial
measures in 8.7 with respect to Lebesgue measure $L$ we can show that $(^{16})$

(8.10)  $(\det F'_{ff})\, F'_{\hat{g}\hat{g}} = F'_{gf}\, (\operatorname{adj} F'_{ff})\, F'_{fg}$, a.e. ($L$),

whether or not the functions $F_{ff}$, $F_{gg}$, $F_{fg}$ are absolutely continuous on
$[0, 2\pi]$. From (8.10) we can in turn deduce the following

8.11 Lemma. Let (i) $(f_n)_{-\infty}^{\infty}$ be a q-ple, weakly stationary SP, and
$\Delta_q$ be the determinant of the derivative of its spectral distribution,
(ii) $\Delta_{q-1}$ be defined similarly for the (q-1)-ple SP formed by the first
q-1 components of $(f_n)_{-\infty}^{\infty}$, (iii) $\tilde{f}_n^{(q)}$ be the projection of the last component
of $f_n$ on $\{\mathfrak{S}(f_m^i,\ -\infty < m < \infty,\ 1 \le i \le q-1)\}^{\perp}$. Then the (real-valued)
spectral distribution $\tilde{F}_q$ of the (1-ple) SP $(\tilde{f}_n^{(q)})_{-\infty}^{\infty}$ satisfies
the equation

$\Delta_{q-1}(\theta) \cdot \tilde{F}_q'(\theta) = \Delta_q(\theta)$, a.e. (Leb.).

In case $\Delta_{q-1} > 0$ a.e. (Leb.), we have of course

$\tilde{F}_q'(\theta) = \Delta_q(\theta) / \Delta_{q-1}(\theta)$, a.e. (Leb.).
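
For orientation, the case q = 2 of Lemma 8.11 can be written out explicitly. The following is only a sketch: the names $F'_{ij}$ for the entries of the spectral density and the symbol $\kappa$ are labels introduced here for illustration, not notation from the text, and we assume $F'_{11} > 0$ and $F'_{22} > 0$ a.e. (Leb.).

```latex
\Delta_1 = F'_{11}, \qquad \Delta_2 = F'_{11} F'_{22} - |F'_{12}|^2 ,
\\
\tilde{F}_2' = \frac{\Delta_2}{\Delta_1}
            = F'_{22} - \frac{|F'_{12}|^2}{F'_{11}}
            = F'_{22}\,\bigl(1 - \kappa^2\bigr),
\qquad
\kappa^2 = \frac{|F'_{12}|^2}{F'_{11} F'_{22}} .
```

Here $\kappa$ is what is usually called the coherence of the two components. The residual density of the second component vanishes exactly where $\kappa = 1$.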


This result was obtained by Matveev in 1960, cf. [23, p. 35, (5)], in
the course of deriving spectral conditions for the determinism of a q-ple
SP. With q = 2 it reappears in 1964 in a paper of L. H. Koopmans
[13, Thm. 1], who seems to have been unaware of Matveev's work. Indeed,
many of Koopmans' results on coherence of processes [13, 14] are deducible from
those given above and standard theorems on Besicovitch derivatives [1].

9.  Spectral analysis of a purely non-deterministic SP


It is easy to show, cf. [36, I, 7.7(a)], that if $(f_n)_{-\infty}^{\infty}$ is a moving
average SP, i.e.

(9.1)  $f_n = \sum_{k=-\infty}^{\infty} C_k h_{n-k}$,  $(h_m, h_n) = \delta_{mn} J$,  $\sum_{k=-\infty}^{\infty} |C_k J|_E^2 < \infty$,

where $J$ is a projection matrix, then its spectral distribution $F$ is absolutely
continuous and

(9.2)  $F'(\theta) = \Psi(e^{i\theta}) \Psi^*(e^{i\theta})$, a.e. (Leb.), where $\Psi(e^{i\theta}) = \sum_{k=-\infty}^{\infty} C_k J e^{ki\theta}$.

The inequality in (9.1) shows that each entry of $\Psi$ is in $L_2$ on the unit
circle $C = [\,|z| = 1\,]$ of the complex plane, a fact we shall express by
writing $\Psi \in L_2(C)$.
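
The relation between (9.1) and (9.2) can be checked numerically for a finite moving average. The sketch below is illustrative only: the function names, the 2-ple coefficients `C` and the choice `J = I` are assumptions made for the example. It compares the covariances obtained from the Gram relation $(h_m, h_n) = \delta_{mn} J$ with the Fourier coefficients of $\Psi\Psi^*$.

```python
import numpy as np

def spectral_density(C, J, theta):
    """F'(theta) = Psi Psi^*, with Psi(e^{i theta}) = sum_k C_k J e^{ki theta} as in (9.2)."""
    psi = sum((Ck @ J) * np.exp(1j * k * theta) for k, Ck in C.items())
    return psi @ psi.conj().T

def covariance_from_density(C, J, n, N=64):
    """(1/2pi) * integral of e^{-ni theta} F'(theta) d theta, via an N-point Riemann sum."""
    thetas = 2 * np.pi * np.arange(N) / N
    return sum(np.exp(-1j * n * t) * spectral_density(C, J, t) for t in thetas) / N

def covariance_direct(C, J, n):
    """Gamma_n = (f_n, f_0) = sum_k C_k J C_{k-n}^*, using (h_m, h_n) = delta_mn J."""
    q = J.shape[0]
    total = np.zeros((q, q), dtype=complex)
    for k, Ck in C.items():
        if (k - n) in C:
            total += Ck @ J @ C[k - n].conj().T
    return total

# Hypothetical example: a 2-ple moving average with two nonzero coefficients.
C = {0: np.array([[1.0, 0.0], [0.5, 1.0]]), 1: np.array([[0.3, -0.2], [0.1, 0.4]])}
J = np.eye(2)
```

Since the density of a finite moving average is a trigonometric polynomial, the Riemann sum is exact up to rounding, so the two computations of $\Gamma_n$ should agree to machine precision.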

Now let $(f_n)_{-\infty}^{\infty}$ be any non-deterministic process. Then, as emphasized
in §4, the coefficients $A_k \sqrt{G}$, which occur in its Wold-Zasuhin
decomposition:

(9.3)  $f_n = u_n + v_n = \sum_{k=0}^{\infty} A_k \sqrt{G}\, h_{n-k} + (f_n \mid \mathfrak{M}_{-\infty})$,  $\sum_{k=0}^{\infty} |A_k \sqrt{G}|_E^2 < \infty$,

are uniquely determined by the SP. This fact and (9.2) suggest associating
with our SP the function $\Phi$ defined by

(9.4)  $\Phi(e^{i\theta}) = \sum_{k=0}^{\infty} A_k \sqrt{G}\, e^{ki\theta}$.

We call $\Phi$ the generating function of $(f_n)_{-\infty}^{\infty}$. It plays a vital role in
the theory. The inequality in (9.3) shows that $\Phi \in L_2(C)$. But the Fourier
series of $\Phi$ is devoid of negative frequency terms, and actually $\Phi \in L_2^{0+}(C)$,
where

(9.5)  $L_2^{0+}(C) = \{\Psi : \Psi \in L_2(C)\ \&\ \int_0^{2\pi} e^{-ki\theta} \Psi(e^{i\theta})\, d\theta = 0 \text{ for } k < 0\}$.

From (9.2) we immediately get:

9.6 Thm. The purely non-deterministic part $(u_n)_{-\infty}^{\infty}$ in the Wold-Zasuhin
decomposition of $(f_n)_{-\infty}^{\infty}$ has an absolutely continuous spectral distribution $F_u$
such that

$F_u'(\theta) = \Phi(e^{i\theta}) \Phi^*(e^{i\theta})$ a.e. (Leb.),

where $\Phi \in L_2^{0+}(C)$ is the generating function of $(f_n)_{-\infty}^{\infty}$, and $\Phi^*(\cdot) = \{\Phi(\cdot)\}^*$.

In case $(f_n)_{-\infty}^{\infty}$ is itself purely non-deterministic, we have $v_n = 0$,
$u_n = f_n$, $F_u = F$, and it follows from 9.6 that $F$ itself is absolutely
continuous and $F' = \Phi \Phi^*$ a.e., where $\Phi \in L_2^{0+}(C)$. The converse also
holds as Rosanov [29] has shown, cf. also [37, 2.3]. We thus get the
following spectral characterization of purely non-deterministic SP's:

9.7 Thm. (Rosanov). A q-ple SP is purely non-deterministic,
if and only if its spectral distribution $F$ is absolutely continuous and $F'$
admits a factorization

$F'(\theta) = \Psi(e^{i\theta}) \Psi^*(e^{i\theta})$ a.e. (Leb.), where $\Psi \in L_2^{0+}(C)$.

A function $\Psi$ in $L_2^{0+}(C)$ admits a holomorphic extension $\Psi_+$ to the inner
disk $D_+ = [\,|z| < 1\,]$ and its adjoint $\Psi^*$ a holomorphic extension to the outer
disk $D_- = [\,1 < |z| < \infty\,]$. Thm. 9.7 thus reveals an interesting connection
between q-ple prediction and the inner-outer factorization of $q \times q$ matrix-valued
functions in $L_1(C)$.

10.  Spectral  analysis  of  a  full-rank  SP 

Let $F$ be the spectral distribution of a q-ple, weakly stationary SP
$(f_n)_{-\infty}^{\infty}$, and $G$ be its prediction error matrix for lag 1, cf. (4.8), (4.9).
Then, cf. [36, I, 7.10; or 8, I, Thm. 8],

(10.1)  $\det G = \exp\{\frac{1}{2\pi} \int_0^{2\pi} \log \det F'(\theta)\, d\theta\}$.

This fundamental equality, first stated by Whittle [34] in 1953, is a determinantal
extension of the Szegö-Kolmogorov identity [12, (8.44)] for
$q = 1$, and shows at once that

(10.2)  $p$ = rank of SP = $q$ $\iff$ $\log \det F'(\cdot) \in L_1[0, 2\pi]$.
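
The identity (10.1) lends itself to a quick numerical check. The sketch below is an illustration under assumptions of our own: it takes a hypothetical 2-ple one-sided moving average with $\Phi(z) = C_0 + C_1 z$, where the matrices are chosen so that $\det \Phi$ has no zeros in the closed unit disk, and it uses the fact that in this situation $\det G = |\det C_0|^2$.

```python
import numpy as np

def mean_log_det_density(C0, C1, N=2048):
    """(1/2pi) * integral over [0, 2pi] of log det F'(theta), for F' = Phi Phi^*, Phi = C0 + C1 e^{i theta}."""
    thetas = 2 * np.pi * np.arange(N) / N
    values = []
    for t in thetas:
        phi = C0 + C1 * np.exp(1j * t)
        values.append(np.log(abs(np.linalg.det(phi)) ** 2))  # det F' = |det Phi|^2
    return np.mean(values)

# Hypothetical coefficients; det(C0 + C1 z) = 2 + 1.1 z + 0.26 z^2 has both zeros outside the unit disk.
C0 = np.array([[2.0, 0.0], [1.0, 1.0]])
C1 = np.array([[0.5, 0.2], [-0.3, 0.4]])

det_G_time = abs(np.linalg.det(C0)) ** 2              # time-domain value of det G (assumed form)
det_G_spectral = np.exp(mean_log_det_density(C0, C1))  # right-hand side of (10.1)
```

Under these assumptions the two values should coincide up to quadrature error, which is negligible here because the integrand is smooth and periodic.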

We thus have a perfect spectral characterization of the full-rank case. A
less obvious consequence of (10.1) is the following important result
[36, I, 7.11]:

10.3 Thm. Let (i) $(f_n)_{-\infty}^{\infty}$ be a q-ple, weakly stationary SP of
full rank $q$, (ii) $f_n = u_n + v_n$ be its Wold decomposition (4.14 et seq.),
(iii) $F$, $F_u$, $F_v$ be the spectral distributions of the processes $(f_n)_{-\infty}^{\infty}$,
$(u_n)_{-\infty}^{\infty}$, $(v_n)_{-\infty}^{\infty}$, (iv) $F_a$, $F_b$ be the absolutely continuous, and non-absolutely
continuous parts of $F$ $(^{17})$. Then

$F_u = F_a$ and $F_v = F_b$.

We may paraphrase this result by saying that in the full rank case
there is concordance between the Wold-Zasuhin decomposition in the time-domain
and the Lebesgue-Cramer decomposition in the spectral domain.

On combining 10.3 and 9.6 we get another important result on full-rank
processes. Since $F_u = F_a$ and $F_b' = 0$, we have $F' = F_u'$, whence:

10.4 Thm. The derivative $F'$ of the spectral distribution of any
full-rank SP admits the factorization

$F'(\theta) = \Phi(e^{i\theta}) \Phi^*(e^{i\theta})$, a.e. (Leb.),

where $\Phi \in L_2^{0+}(C)$ is the generating function (cf. 9.4).

The results (10.2), 10.3, 10.4 shed much light on the full-rank case
$p = q$, and it is natural to ask whether corresponding results are available
for $0 < p < q$. We have three questions:

Q.1. Given $0 < p < q$, what is the spectral n. & s. c. that a SP
have the rank $p$?

Q.2. For a SP of rank $p$ such that $0 < p < q$, are the Wold-Zasuhin
and Lebesgue-Cramer decompositions in concordance?
If not, what extra condition would restore this concordance?

Q.3. For a SP of rank $p$ such that $0 < p < q$, does $F'$ admit a
factorization $F' = \Psi \Psi^*$ a.e. (Leb.), where $\Psi \in L_2^{0+}(C)$? If not,
what additional condition would ensure such factorization?

Of these the most basic question Q.1 is still unanswered, despite some
important work by Matveev which we shall discuss in §12. Q.2 and Q.3
have, however, been answered satisfactorily, cf. §§11, 12.

11.  Concordance of Wold-Zasuhin and Lebesgue-Cramer decompositions
for degenerate ranks

In 1959 the writer showed that the answer to the first part of question
Q.2 (§10) is in the negative. He gave an example of a 2-ple process of
rank 1 for which

(11.1)  $\{0\} \ne \mathfrak{M}_{-\infty}$ & $F$ is absolutely continuous,

[19, §3]. For this SP concordance between the W. Z. and L. C. decompositions
fails since $F_v \ne 0 = F_b$, cf. 10.3 et seq.

For 2-ple processes of rank 1, the writer also gave the extra condition
needed for concordance, viz. $\det F'(\theta) = 0$, a.e. (Leb.), cf. [19, 4.5]. Now
it is easy to show that rank $F'(\theta) \ge$ rank $F_u'(\theta) = p$, a.e. (Leb.). (Just
combine 9.6 with 13.3 below.) Hence for $q = 2$, $p = 1$ our result may be
written:

concordance $\iff$ rank $F'(\theta) = 1$, a.e. (Leb.).

A complete generalization of this result was obtained by Robertson [26, 10.2]
in 1963:

11.2 Thm. (Robertson) For any q-ple, weakly stationary SP of (any)
rank $p$, there is concordance between the W. Z. and L. C. decompositions,
if and only if rank $F'(\theta) = p$ a.e. (Leb.), where $F$ is the spectral
distribution of the SP.

To prove this theorem Robertson used a result on the ranges of the
matrices $F'(\theta)$, viewed as linear transformations on $\mathbb{C}^q$ to $\mathbb{C}^q$,
$\mathbb{C}$ being the complex-number field. This result is itself interesting
[26, 9.11]:

11.3 Lemma (Robertson). Let $(x_n)_{-\infty}^{\infty}$, $(y_n)_{-\infty}^{\infty}$, $(z_n)_{-\infty}^{\infty}$ be q-ple
weakly stationary SP's with the same shift operator, and let

$x_n = y_n + z_n$,  $\mathfrak{M}_\infty^{(x)} = \mathfrak{M}_\infty^{(y)} + \mathfrak{M}_\infty^{(z)}$,  $\mathfrak{M}_\infty^{(y)} \perp \mathfrak{M}_\infty^{(z)}$.

Then, with an obvious notation,

(a)  Range $F_x'(\theta)$ = Range $F_y'(\theta)$ + Range $F_z'(\theta)$, a.e. (Leb.)

(b)  Range $F_y'(\theta)$ $\cap$ Range $F_z'(\theta)$ = $\{0\}$, a.e. (Leb.)

(c)  rank $F_x'(\theta)$ = rank $F_y'(\theta)$ + rank $F_z'(\theta)$, a.e. (Leb.).

Some of these results were duplicated independently by Jang Ze-pei
[10].


12.  Degenerate  rank  factorization 

The writer's example mentioned in connection with the question Q.2
in §11 also shows that the answer to the first part of Q.3 (§10) is in the
negative. For this consider the 2-ple process of rank 1 satisfying (11.1).
Were $F' = \Psi \Psi^*$ a.e. with $\Psi \in L_2^{0+}(C)$, then since $F$ is absolutely continuous,
it would follow from Rosanov's Thm. 9.7 that the SP is purely
non-deterministic, in contradiction to the assertion $\mathfrak{M}_{-\infty} \ne \{0\}$ in (11.1).

As in the case of Q.2, the extra conditions needed to secure a positive
answer to Q.3 were first given for the case $q = 2$, this time by Wiener and
the writer [37, 4.1] in 1959; and the result was then fully generalized
by Matveev [24] in 1960, but for processes with continuous time. Whereas
in the full rank case $p = q$ we encounter (holomorphic) functions of the
Hardy class $H_2$ on the disk $D_+$, or rather their radial limits in $L_2^{0+}(C)$,
for the degenerate rank cases $1 \le p < q$ we encounter quotients of such
functions on $D_+$, i.e. the (meromorphic) beschränktartige functions
introduced by R. Nevanlinna to round off the Hardy class theory. (For a brief,
relevant account see [37, & Note on p. 308].) The final result, cf.
[24, Thm. 1], is as follows:

12.1 Thm. (Matveev) Let $F$ be the spectral distribution of a
q-ple SP. Then $F'$ admits a factorization

$F'(\theta) = \Psi(e^{i\theta}) \Psi^*(e^{i\theta})$ a.e. (Leb.), where $\Psi \in L_2^{0+}(C)$,

if and only if

(1)  rank $F'(\theta)$ = const. $p$, a.e. (Leb.),

(2)  there is a principal $p \times p$ minor $(^{18})$ $M = M_{i_1 \ldots i_p}$ of $F'$ such that

$\log \det M \in L_1[0, 2\pi]$,

and for each $j \notin \{i_1, \ldots, i_p\}$, and each $k \in \{1, \ldots, p\}$

$\frac{\det M_{i_k}^{j}(\theta)}{\det M(\theta)} = \lim_{r \to 1-} \psi_{jk}(r e^{i\theta})$, a.e. (Leb.), $\psi_{jk}$ beschränktartige,

where $M_{i_k}^{j}$ denotes the minor obtained from $M$ by replacing the $i_k$th row of
$M$ by the appropriate entries of the $j$th row of $F'$.

Matveev's proof is based on the fact:

(12.2)  $\Psi \in L_2^{0+}(C) \implies$ rank $\Psi(e^{i\theta})$ = const. a.e. (Leb.),

which emerges on applying theorems on Hardy class functions to the sums of
the determinants of principal $r \times r$ minors of $\Psi \Psi^*$, $1 \le r \le q$, cf. [16, 2.5].

Matveev showed that if the constant in (12.2) is $p$, then

$\Psi(e^{i\theta}) \Psi^*(e^{i\theta}) = X(e^{i\theta}) X^*(e^{i\theta})$,

where

$X = [\, \underbrace{\cdots}_{q \times p} \mid \underbrace{0}_{q \times (q-p)} \,] \in L_2^{0+}(C)$.


The example mentioned above and Thm. 12.1 together answer the question
Q.3 completely. Thm. 12.1 also answers completely the following question
related to Q.1:

Q.1'. Given $0 < p < q$, what is the spectral n. & s. c. that a SP
be purely non-deterministic and have rank $p$?

The answer is immediate from the Theorems 9.7 and 12.1 of Rosanov
and Matveev:

12.3 Cor. A q-ple SP is purely non-deterministic and of rank $p$,
if and only if its spectral distribution $F$ is absolutely continuous and $F'$
satisfies the conditions (1), (2) of Thm. 12.1.

Unfortunately, this still leaves us in the dark as regards Q.1. For
instance, the answer:

$F'$ satisfies the conditions (1), (2) of Thm. 12.1

won't do. Indeed, by Robertson's Thm. 11.2 the condition (1) of 12.1 ensures
concordance, whereas we know that there are non-deterministic processes for which
concordance fails. A proper answer to Q.1 would be a major contribution.


13.  Spectral and autoregressive representations for the predictor of a
purely non-deterministic SP

We shall now turn to prediction proper. Let $(f_n)_{-\infty}^{\infty}$ be a q-ple,
non-deterministic SP, $h_n$ be its $n$th normalized innovation, and
$\hat{f}_\nu = (f_\nu \mid \mathfrak{M}_0)$ be the prediction of $f_\nu$ with lag $\nu \ge 1$. Since $h_0$,
$\hat{f}_\nu \in \mathfrak{M}_\infty$ they have, cf. Thm. 7.3, isomorphs $W$, $Y_\nu \in L_{2,M}$ such that $(^{19})$

(13.1)  $h_n = \int_0^{2\pi} e^{-ni\theta} W(e^{i\theta}) \mathbf{E}(d\theta) f_0$,  $\hat{f}_\nu = \int_0^{2\pi} Y_\nu(e^{i\theta}) \mathbf{E}(d\theta) f_0$.

Our first problem is to find $Y_\nu$ for a purely non-deterministic SP.
For such a process the Wold-Zasuhin decomposition (4.14)-(4.17) shows
that

$f_n = \sum_{k=0}^{\infty} A_k \sqrt{G}\, h_{n-k}$,  $\hat{f}_\nu = \sum_{k=\nu}^{\infty} A_k \sqrt{G}\, h_{\nu-k}$.

Letting $n = 0$, taking isomorphs and proceeding heuristically, we get

$I = \sum_{k=0}^{\infty} A_k \sqrt{G}\, e^{ki\theta} W(e^{i\theta}) = \Phi(e^{i\theta}) W(e^{i\theta})$,

(13.2)  $Y_\nu(e^{i\theta}) = \sum_{k=\nu}^{\infty} A_k \sqrt{G}\, e^{(k-\nu)i\theta} W(e^{i\theta}) = [e^{-\nu i\theta} \Phi(e^{i\theta})]_{0+} W(e^{i\theta})$,

where $\Phi$ is the generating function of the SP, and $[\ldots]_{0+}$ denotes the
function obtained from ... by cutting off the negative frequency terms from
its Fourier series. The first equation yields $W(\cdot) = \{\Phi(\cdot)\}^{-1}$, which is wrong
since $\Phi$ need not be invertible. Our heuristic procedure is thus untenable,
but it reveals that the determination of $Y_\nu$ is tied up with the possibility
of inverting the generating function $\Phi$.

To investigate the invertibility of $\Phi$, we first note that its
degeneracies stem from a constant matrix, as the following canonical
factorization given in [16, 3.1] and also [22, 3.6] makes clear:

13.3 Thm. The generating function $\Phi$ of any q-ple, non-deterministic
SP is expressible in the form $\Omega(\cdot) \sqrt{G}$, where $G$ is the prediction error
matrix with lag 1, and $\Omega \in L_2^{0+}(C)$ is invertible a.e. (Leb.), and $(^{20})$
$\Omega_+(0) = I$. In fact

$\Omega(e^{i\theta}) = J^{\perp} + \Phi(e^{i\theta}) H = I + \sum_{k=1}^{\infty} A_k J e^{ki\theta}$,

where $J$ and $H$ are as in (4.10)-(4.12).


Since $J$, $H$, $\Phi$ are uniquely determined by the SP (cf. (4.9), (9.3),
et seq.), so is $\Omega$. In fact, as the writer showed in [17, 2.2], its inverse
$\Omega^{-1}(\cdot) = \{\Omega(\cdot)\}^{-1}$ is the isomorph in $L_{2,M}$ of the (non-normalized) innovation
$g_0$ in the purely non-deterministic case, i.e.

(13.4)  $g_n = \int_0^{2\pi} e^{-ni\theta}\, \Omega^{-1}(e^{i\theta}) \mathbf{E}(d\theta) f_0$,  $\Omega^{-1} \in L_{2,M}$.

Since $h_0 = H g_0$, its isomorph $W$ is of course $H \Omega^{-1} = (\Omega H^{-1})^{-1}$.
From 13.3 it easily follows that $\Omega H^{-1} = J^{\perp} + \Phi$. Thus, for a purely
non-deterministic SP we find that

(13.5)  $W(e^{i\theta}) = \{J^{\perp} + \Phi(e^{i\theta})\}^{-1}$ in $L_{2,M}$.

For $q = 1$ we know the corresponding result for any (purely or impurely)
non-deterministic SP, viz.

(13.6)  $W(e^{i\theta}) = \chi_A(\theta) / \Phi(e^{i\theta})$ in $L_{2,M}$,

where $A$ is any subset of $[0, 2\pi]$ such that $A$, $A'$ are carriers of the
(mutually singular) measures $|E(\cdot) u_0|^2$, $|E(\cdot) v_0|^2$, $u_0$, $v_0$ being as in
the Wold decomposition. But for $q > 1$ the difficulties caused by rank
deficiencies and the failure of concordance (§11) have prevented so far the
discovery of a full-fledged generalization of (13.6).


Inserting the value of $W$ given by (13.5) into the heuristically
obtained equation (13.2), we get

(13.7)  $Y_\nu(e^{i\theta}) = [e^{-\nu i\theta} \Phi(e^{i\theta})]_{0+} \{J^{\perp} + \Phi(e^{i\theta})\}^{-1}$ in $L_{2,M}$.

This equality was proved for purely non-deterministic SP's of full rank
$q$ in [36, II, 4.11]; a slight variation of the arguments used therein shows
its validity for $1 \le p < q$. The equations (13.1), (13.5), (13.7) thus
yield spectral expressions for the predictor and for the innovations of a
purely non-deterministic SP.
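
For a scalar process of full rank ($q = 1$, so that $J^{\perp} = 0$), the expression (13.7) can be evaluated on a grid with the fast Fourier transform. The sketch below is an illustration, not a method taken from the text: the function name and the example $\Phi(e^{i\theta}) = 1 + c\,e^{i\theta}$ are assumptions. The cut-off $[\ldots]_{0+}$ is realized by discarding the negative-frequency Fourier coefficients.

```python
import numpy as np

def predictor_isomorph(phi_vals, nu):
    """Sample Y_nu = [e^{-i nu theta} Phi]_{0+} / Phi on the grid theta_j = 2 pi j / N (scalar case)."""
    N = len(phi_vals)
    half = N // 2
    coeffs = np.fft.fft(phi_vals) / N           # coeffs[k] approximates C_k for 0 <= k < N/2
    shifted = np.zeros(N, dtype=complex)
    shifted[: half - nu] = coeffs[nu:half]      # coefficient of e^{ki theta} in e^{-i nu theta} Phi is C_{nu+k}
    truncated = np.fft.ifft(shifted) * N        # values of [e^{-i nu theta} Phi]_{0+} on the grid
    return truncated / phi_vals

# Hypothetical example: Phi(e^{i theta}) = 1 + c e^{i theta} with |c| < 1.
c, N = 0.5, 256
theta = 2 * np.pi * np.arange(N) / N
phi = 1 + c * np.exp(1j * theta)
Y1 = predictor_isomorph(phi, nu=1)
Y2 = predictor_isomorph(phi, nu=2)
```

For this example one expects $Y_1 = c / \Phi$ and $Y_2 = 0$, which matches the elementary fact that a first-order moving average cannot be predicted more than one step ahead.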

We must next investigate the expressibility of the predictor $\hat{f}_\nu$ directly
in terms of the $f_{-k}$, $k \ge 0$. For this we must appeal to a most basic
property of the generating function, established by the writer [19, 2.9], viz.
its optimality $(^{21})$:

13.8 Basic Lemma. (a) The generating function $\Phi$ of a q-ple, non-deterministic
SP of any rank $p$ is an optimal function in $L_2^{0+}(C)$, i.e.

(1)  $\Phi_+(0) \ge 0$,

(2)  $\Psi \in L_2^{0+}(C)$ & $\Psi \Psi^* = \Phi \Phi^*$, a.e. (Leb.) on $C$
     $\implies$ $\sqrt{\Psi_+(0) \Psi_+^*(0)} \le \Phi_+(0)$.

(b) In case $p = q$, we have

$\det \Phi_+(z) = \exp\{\frac{1}{2\pi} \int_0^{2\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z} \log \det \sqrt{F'(\theta)}\, d\theta\}$, $|z| < 1$.

Now let us confine attention to purely non-deterministic processes
of full rank. By 13.8(b), the holomorphic (matrix-valued) function $\Phi_+$ on
the inner disk $D_+$:

$\Phi_+(z) = \sum_{k=0}^{\infty} C_k z^k$, $z \in D_+$, where $C_k = A_k \sqrt{G}$,

is invertible at each $z \in D_+$, and hence

$\{\Phi_+(z)\}^{-1} = \sum_{k=0}^{\infty} D_k z^k$, $z \in D_+$,

where the $D_k$ are matrix coefficients, satisfying

$C_0 D_0 = I$,  $C_0 D_1 + C_1 D_0 = 0$,  $C_0 D_2 + C_1 D_1 + C_2 D_0 = 0$, ...

It follows from (13.7) that $Y_\nu$ has a holomorphic extension $(Y_\nu)_+$ to
$D_+$ given by

$(Y_\nu)_+(z) = \sum_{k=0}^{\infty} C_{\nu+k} z^k \cdot \sum_{k=0}^{\infty} D_k z^k = \sum_{k=0}^{\infty} E_{\nu k} z^k$, $z \in D_+$,

where

(13.9)  $E_{\nu k} = \sum_{j=0}^{k} C_{\nu+j} D_{k-j}$.

This suggests that in some sense we should have

$Y_\nu(e^{i\theta}) = \sum_{k=0}^{\infty} E_{\nu k} e^{ki\theta}$.

But in general $Y_\nu \notin L_2(C)$, and the $E_{\nu k}$ will not be the ordinary Fourier
coefficients of $Y_\nu$. There will, however, be circumstances under which

(13.10)  $\sum_{k=0}^{n} E_{\nu k} e^{ki\theta} \to Y_\nu(e^{i\theta})$ in $L_{2,M}$ topology $(^{22})$, as $n \to \infty$,

and hence by our Isomorphism Thm. 7.3,

(13.11)  $\sum_{k=0}^{n} E_{\nu k} f_{-k} \to \hat{f}_\nu$ in $\mathfrak{M}_\infty$, as $n \to \infty$.

In short, there are processes for which the coefficients $E_{\nu k}$ given in
(13.9) provide an autoregressive representation for the predictor, to wit

(13.12)  $\hat{f}_\nu = \sum_{k=0}^{\infty} E_{\nu k} f_{-k}$.

For such processes we call $(E_{\nu k})_{k=0}^{\infty}$ the time-domain weighting sequence
for the predictor $\hat{f}_\nu$.
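
The passage from the Fourier coefficients $C_k$ of the generating function to the weighting coefficients $E_{\nu k}$ is purely algebraic, and a minimal sketch may help fix the indices. The function names and the 2-ple example with only $C_0$, $C_1$ nonzero are assumptions made for illustration; convergence questions such as (13.10) are not addressed by the code.

```python
import numpy as np

def inverse_series(C, n_terms):
    """Taylor coefficients D_0..D_{n_terms-1} of {sum_k C_k z^k}^{-1}, from C_0 D_0 = I, C_0 D_1 + C_1 D_0 = 0, ..."""
    q = C[0].shape[0]
    C0_inv = np.linalg.inv(C[0])
    D = [C0_inv]
    for k in range(1, n_terms):
        acc = np.zeros((q, q))
        for j in range(1, min(k, len(C) - 1) + 1):
            acc += C[j] @ D[k - j]
        D.append(-C0_inv @ acc)
    return D

def weighting_coefficients(C, D, nu):
    """E_{nu,k} = sum_{j=0}^{k} C_{nu+j} D_{k-j} as in (13.9); C_k is taken as 0 beyond the supplied list."""
    q = C[0].shape[0]
    E = []
    for k in range(len(D)):
        acc = np.zeros((q, q))
        for j in range(k + 1):
            if nu + j < len(C):
                acc += C[nu + j] @ D[k - j]
        E.append(acc)
    return E

# Hypothetical 2-ple example with Phi_+(z) = C_0 + C_1 z.
C = [np.array([[2.0, 0.0], [1.0, 1.0]]), np.array([[0.5, 0.2], [-0.3, 0.4]])]
D = inverse_series(C, 8)
E = weighting_coefficients(C, D, nu=1)
```

A convenient consistency check is the power-series identity $\sum_{j=0}^{k} E_{\nu j} C_{k-j} = C_{\nu+k}$, which follows from $(Y_\nu)_+ \Phi_+ = \sum_k C_{\nu+k} z^k$.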

The papers [36, II; 21] are devoted largely to the demarcation of processes
for which the equivalent conditions (13.10)-(13.12) prevail. No complete
characterization has been obtained so far, only sufficient conditions, the
best being perhaps the one in [21, 5.2]:

(13.13)  $F$ is absolutely continuous on $[0, 2\pi]$,
         $F' \in L_\infty[0, 2\pi]$ & $(F')^{-1} \in L_1[0, 2\pi]$.

Also sufficient, of course, is the stronger boundedness condition, of greater
practical interest, cf. [36, II, 5.1 & 7.3]:

(13.14)  $F$ is absolutely continuous,
         $\lambda I \le F'(e^{i\theta}) \le \lambda' I$ a.e. (Leb.), where $0 < \lambda < \lambda' < \infty$.


It is easy to show that the autoregressive relation (13.12) is equivalent
to the discrete matricial Hopf-Wiener equation:

(13.15)  $\Gamma_{n+\nu} = \sum_{k=0}^{\infty} E_{\nu k} \Gamma_{n-k}$, $n \ge 0$.
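
A scalar example makes the Hopf-Wiener equation (13.15) concrete. The sketch below assumes the process $f_n = h_n + c\,h_{n-1}$ with $|c| < 1$ and orthonormal $h_n$, for which the lag-1 weights are $E_{1k} = c(-c)^k$. Both the example and the function names are illustrative choices, not taken from the text.

```python
def gamma(n, c):
    """Covariances of the assumed process f_n = h_n + c h_{n-1}: Gamma_0 = 1 + c^2, Gamma_{+-1} = c."""
    if n == 0:
        return 1 + c * c
    if abs(n) == 1:
        return c
    return 0.0

def weight(k, c):
    """Lag-1 weighting coefficients E_{1k} = c (-c)^k for the same process."""
    return c * (-c) ** k

def hopf_wiener_residual(n, c, K=200):
    """Gamma_{n+1} - sum_{k=0}^{K-1} E_{1k} Gamma_{n-k}; should vanish for n >= 0."""
    return gamma(n + 1, c) - sum(weight(k, c) * gamma(n - k, c) for k in range(K))

residuals = [hopf_wiener_residual(n, 0.5) for n in range(6)]
```

Because $\Gamma_m$ vanishes for $|m| > 1$, only three terms of each sum are nonzero, and the residuals cancel exactly in exact arithmetic.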


The continuous parameter analogues of (13.12), (13.15) are, for a given
real lag $h > 0$,

(13.12')  $\hat{f}_h = \int_0^{\infty} dE_h(\tau) \cdot f_{-\tau}$,

(13.15')  $\Gamma(t+h) = \int_0^{\infty} dE_h(\tau) \cdot \Gamma(t-\tau)$, $t \ge 0$,

where the weighting $E_h(\cdot)$ is a $q \times q$ matrix-valued function of bounded
variation on $[0, \infty)$. These are the equations with which Wiener began the
subject of multivariate prediction [35, Ch. IV]. He showed that in simple
cases of practical interest the weighting $E_h(\cdot)$ can be found by solving
the matricial Hopf-Wiener equation (13.15'). We now see that his pioneering
work belongs to a rather late chapter of the general theory.


14.  Determination  of  the  generating  function  from  the  spectral  density. 

Given the covariance bisequence $(\Gamma_n)_{-\infty}^{\infty}$ or equivalently the spectral
density $F'$ of a q-ple, purely non-deterministic SP of full rank, the
determination of its generating function $\Phi$ is of great importance for prediction.
This is because once $\Phi$ or its Fourier coefficients $C_k$ are found
we can get $\Phi^{-1}$ and its Taylor coefficients $D_k$, and thereafter the
crucial function $Y_\nu$ and its coefficients $E_{\nu k}$ required to determine the
predictor $\hat{f}_\nu$, cf. (13.7), (13.9), (13.1), (13.12).

In the case $q = 1$, $\Phi$ can be found in principle from the equation

(14.1)  $\Phi_+(z) = \exp\{\frac{1}{2\pi} \int_0^{2\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z} \log \sqrt{F'(\theta)}\, d\theta\}$, $|z| < 1$,

and its coefficients $C_k$ found from

(14.2)  $\sum_{k=0}^{\infty} C_k e^{ki\theta} = \exp\{\frac{a_0}{2} + \sum_{k=1}^{\infty} a_k e^{ki\theta}\}$,

where $a_k$ is the $k$th Fourier coefficient of $\log F'$. These are canonical
expressions for optimal Hardy class functions in terms of the norms of their
boundary values. But for $q > 1$ analogous expressions are not available
because the exponential law, $\exp(A + B) = \exp A \cdot \exp B$, fails for
matrices. In fact, no general, closed-form expression for $\Phi$ or for its
leading coefficient $\sqrt{G}$ in terms of $F'$ is known for the cases $q \ge 2$.
Its discovery would be a major contribution.
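
In the scalar case the recipe (14.2) is easy to try on a grid. The following sketch is illustrative only: the function name, the grid size and the test density $F'(\theta) = |1 + c\,e^{i\theta}|^2$ are assumptions. It takes the Fourier coefficients of $\log F'$, keeps $a_0/2$ and the positive-frequency terms, exponentiates, and reads off the $C_k$.

```python
import numpy as np

def generating_coefficients(density_vals, n_coeffs):
    """Scalar case: C_0..C_{n_coeffs-1} of Phi from samples of F' on theta_j = 2 pi j / N, following (14.2)."""
    N = len(density_vals)
    a = np.fft.fft(np.log(density_vals)) / N      # a[k] approximates the k-th Fourier coefficient of log F'
    causal = np.zeros(N, dtype=complex)
    causal[0] = a[0] / 2
    causal[1 : N // 2] = a[1 : N // 2]             # keep a_k for k >= 1, drop the negative frequencies
    phi_vals = np.exp(np.fft.ifft(causal) * N)     # Phi = exp{a_0/2 + sum_{k>=1} a_k e^{ki theta}} on the grid
    return (np.fft.fft(phi_vals) / N)[:n_coeffs]

# Hypothetical test density with known generating function 1 + c e^{i theta}.
c, N = 0.5, 1024
theta = 2 * np.pi * np.arange(N) / N
density = np.abs(1 + c * np.exp(1j * theta)) ** 2
coeffs = generating_coefficients(density, 3)
```

For this density the computed coefficients should be close to $C_0 = 1$, $C_1 = c$, $C_2 = 0$. The discretization error comes from aliasing of the coefficients $a_k$ and is negligible here because they decay geometrically.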

Fortunately, we do have an infinite series expansion for $\Phi$ and $G$ in
terms of $F'$ in case the conditions (13.13) are met, i.e. for the only known
case in which the predictor has an autoregressive representation (13.12),
cf. [21, 4.7]. Since an explanation of this result and its proof would
require a digression, we shall only describe how with its aid the crucial
weighting coefficients $E_{\nu k}$ may be computed from the $\Gamma_k$. For simplicity
we shall assume that our SP satisfies the stronger boundedness condition
(13.14) rather than (13.13). For details the reader should see the papers
[36, II, §6; 21, §§4, 5].

Knowing the covariances $\Gamma_k$ and the bounds $\lambda$, $\lambda'$ in (13.14), we
first obtain the slightly modified coefficients $(^{23})$


Co'I> 


r' 

-n 


n  *  0  . 


We then compute $A_0, A_1, A_2, \ldots$, where $A_0 = I$, and for $m > 0$

(14.3)  $A_m = -\Gamma'_m + \sum_{n=1}^{\infty} \Gamma'_n \Gamma'_{m-n} - \sum_{n=1}^{\infty} \sum_{p=1}^{\infty} \Gamma'_p \Gamma'_{n-p} \Gamma'_{m-n} + \cdots$.

We next find $B_0, B_1, B_2, \ldots$ from the recurrence relations

$A_0 B_0 = I$,  $A_0 B_1 + A_1 B_0 = 0$,  $A_0 B_2 + A_1 B_1 + A_2 B_0 = 0$, ...

Since $A_0 = I$, this computation does not involve matrix inversion. Finally,
for any given $\nu \ge 1$, we compute the coefficients $E_{\nu 0}, E_{\nu 1}, E_{\nu 2}, \ldots$
from

(14.4)  $E_{\nu k} = \sum_{j=0}^{k} B_{\nu+j} A_{k-j}$, $k \ge 0$.

These are the weighting coefficients required in the autoregressive series
(13.12) for the predictor $\hat{f}_\nu$. To complete the solution of the Prediction
Problem 2.16, we must find the prediction error matrix $G_\nu$ for lag $\nu$.
For this, we first compute the crucial matrix $G$ from

(14.5)  g  =  ttIt  Z  Z  a  r  a*  , 

-  Wn=0m=0-m-n-m-n 


and then $G_\nu$ from

(14.6)  $G_\nu = \sum_{k=0}^{\nu-1} B_k G B_k^*$.
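
The last steps of this scheme are straightforward to program once the $A_k$ and $G$ are in hand. The sketch below assumes they have already been obtained. For illustration it uses the hypothetical choice $A_k = (-B)^k$ for a fixed matrix $B$, so that the recurrence must return $B_0 = I$, $B_1 = B$, $B_k = 0$ for $k \ge 2$. The order of the factors in the recurrence, and the function names, are assumptions of the sketch.

```python
import numpy as np

def series_B(A, n_terms):
    """B_0, B_1, ... from A_0 B_0 = I, A_0 B_1 + A_1 B_0 = 0, ...; with A_0 = I no inversion is needed."""
    q = A[0].shape[0]
    B = [np.eye(q)]
    for k in range(1, n_terms):
        acc = np.zeros((q, q))
        for j in range(1, k + 1):
            if j < len(A):
                acc += A[j] @ B[k - j]
        B.append(-acc)
    return B

def lag_error(B, G, nu):
    """G_nu = sum_{k=0}^{nu-1} B_k G B_k^* as in (14.6)."""
    return sum(B[k] @ G @ B[k].conj().T for k in range(nu))

# Hypothetical inputs: A_k = (-B1)^k and a diagonal lag-1 error matrix G.
B1 = np.array([[0.5, 0.2], [-0.3, 0.4]])
A = [np.linalg.matrix_power(-B1, k) for k in range(6)]
G = np.diag([2.0, 1.0])

B = series_B(A, 6)
G2 = lag_error(B, G, nu=2)
```

In this example $G_1 = G$ and $G_2 = G + B_1 G B_1^*$, so the error matrix grows with the lag as expected.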


The practical utility of this scheme of computation will be discussed
in §15. In it the generating function $\Phi$ has been by-passed. But if $\Phi$
is wanted, its Fourier coefficients can be found from the equations

(14.7)  $C_k = B_k \sqrt{G}$.

15.  The factorization of matricial rational spectral densities

In most practical problems of prediction the q-ple SP $(f_n)_{-\infty}^{\infty}$ has a
spectral density $F'$ on $C$, $(^{24})$ which can be analytically continued to
the entire complex plane $\mathbb{C}$ to yield a matrix rational function, i.e. one
whose entries are complex-valued rational functions. Such a spectral
density is said to be rational. Retaining the symbol $F'$ for the extension
to $\mathbb{C}$, we have

(15.1)  $F'(z) = \frac{1}{p(z)} P(z)$, $z \in \mathbb{C}$,

where $P$ is a $q \times q$ matrix polynomial and $p(\cdot)$ a complex polynomial.
We infer easily that there is an integer $r$, $1 \le r \le q$, such that rank
$F'(z) \le r$, for any $z$, and rank $F'(z) = r$ except for a finite number of
$z$. We call $r$ the rank of $F'$.


A basic result, known generally but properly enunciated and proved by
Polyak and Rosanov, cf. [30, Lemma 3; 31, Ch. 1, 10.2], is that a (matricial)
rational spectral density $F'$ admits a factorization

(15.2)  $F'(e^{i\theta}) = \Psi(e^{i\theta}) \Psi^*(e^{i\theta})$, $0 \le \theta \le 2\pi$,

where $\Psi \in L_2^{0+}(C)$ and $\Psi$ is rational, i.e. its analytic continuation to
$\mathbb{C}$ is rational. It follows from Rosanov's Thm. 9.7 that a q-ple process
with a rational spectral density of rank $r > 0$ is purely non-deterministic
and of rank $r$. We owe to Rosanov [30, Thm. 7] the proof that its generating
function $\Phi$ itself has a rational extension $R$ to $\mathbb{C}$, and moreover

(15.3)  rank $R(z) = r$, $z \in D_+$. $(^{25})$

Thus by Thm. 9.6

(15.4)  $F'(z) = R(z) R^*(z)$, $z \in \mathbb{C}$.
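
The simplest instance of (15.4) is a scalar density that is a trigonometric polynomial of degree 1, where the factor can be read off from the zeros of $z F'(z)$. The sketch below is only such a scalar illustration under assumed inputs (`gamma0 > 2|gamma1| > 0`, `gamma1` real). It is not the matricial procedure of any of the authors cited, and the function name is hypothetical.

```python
import numpy as np

def factor_scalar_density(gamma0, gamma1):
    """Factor gamma1 e^{-i theta} + gamma0 + gamma1 e^{i theta} as |C0 + C1 e^{i theta}|^2.

    Assumes gamma1 real and nonzero, gamma0 > 2 |gamma1|. The zero of R(z) = C0 + C1 z
    is placed outside the unit circle, so that R is invertible on the inner disk.
    """
    roots = np.roots([gamma1, gamma0, gamma1])       # zeros of z * F'(z); they come as a reciprocal pair
    z_out = roots[np.abs(roots) > 1][0].real          # the zero outside the unit circle
    C0 = np.sqrt(gamma0 / (1 + 1 / z_out**2))
    C1 = -C0 / z_out
    return C0, C1

# Hypothetical example: F'(e^{i theta}) = 1.25 + cos(theta).
C0, C1 = factor_scalar_density(1.25, 0.5)
```

For the example one expects $R(z) = 1 + 0.5 z$. Choosing the zero inside the circle instead would give another factor with the same modulus on $C$, but one that is not invertible on $D_+$.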

To  carry  out  the  prediction  for  lag  v  for  our  process  when  it  has  full 
rank  q,  we  must  find  R  and 

(15.5)  Yv(ei0)  =[e"Vi0R(eie)]o+  {R(  eiS)  }_I  , 

cf.  §14  and  ( 13.  7) .  The  methods  proposed  for  this  fall  into  two  broad 

categories:  ( i)  algebraic,  (ii)  analytic. 

(i) An algebraic method has been proposed by Yaglom [41, §1] for
continuous time processes, in which R is by-passed, and $Y_\nu$ found
directly. In the discrete version, it is assumed that rank $F'(e^{i\theta})$ = q for
all $\theta$, so that condition (13.14) is fulfilled. Since the $Y_\nu$ in (15.5) is
rational, Yaglom starts out with a rational function $Y_\nu$ with undetermined
coefficients, and shows that these can be found from the conditions

    $Y_\nu(z)$ is holomorphic for $z \in D_+$,

    $\{z^{-\nu} I - Y_\nu(z)\}\, F'(z)$ is holomorphic for $z \in D_-$.

The first of these is obvious from (15.5), and the second is just a spectral
paraphrasing of the condition $f_\nu - \hat{f}_\nu \perp f_k$, k ≤ 0. Yaglom attacks $Y_\nu$
row by row, and obtains a system of linear equations for the unknown
coefficients of $Y_\nu$. He also adapts this method to prediction on the basis
of a bounded time-interval in the past, [41, §2].
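
As a hedged check of the two holomorphy conditions in the simplest setting
(a scalar example of our own, not taken from Yaglom's paper), take q = 1,
lag 1, and the assumed density F'(z) = (1 + az)(1 + a/z) with real a, |a| < 1,
for which the candidate predictor function is $Y_1(z) = a/(1 + az)$. Reading the
second condition as holomorphy outside the closed unit disc, one finds:

```latex
% Assumed illustration: q = 1, lag 1, real a with |a| < 1.
\begin{align*}
Y_1(z) &= \frac{a}{1 + a z}
  \quad \text{has no pole in } |z| \le 1 , \\
\{z^{-1} - Y_1(z)\}\, F'(z)
  &= \frac{1}{z\,(1 + a z)}\,(1 + a z)\Bigl(1 + \frac{a}{z}\Bigr)
   = z^{-1} + a z^{-2} .
\end{align*}
```

The last expression is holomorphic for |z| > 1 and vanishes at infinity, so
both conditions hold for this choice of $Y_1$.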

Youla [42] has given an algebraic technique for carrying out the
factorization (15.4) even for r < q, again for continuous time, based on the
diagonalization of polynomial matrices by elementary transformations,
cf. [5, p. 139]. In the discrete case we get from (15.1)

    $F'(z) = \dfrac{1}{p(z)}\, C_1(z)\, D(z)\, C_2(z),$

where D(·) is a diagonal matrix polynomial and C_1(·), C_2(·) are
matrix polynomials with constant-valued determinants. By exploiting
the hitherto unused fact that F' is a spectral density, Youla shows that
the last factorization can be brought into the form (15.4), where R is a
q×r rational matrix, holomorphic and left-invertible on D+. (A slight
variation of his technique would yield a q×q rational R of rank r.)

In effect, Youla proves a factorization theorem, but his proof is constructive
and provides linear algebraic equations for the determination of R.

Wiener's original approach [35, Ch. IV] may be classified as
analytic-cum-algebraic. To solve the Hopf-Wiener equation (13.15') a Fourier
analytic technique is to be used, which leads to the rational factorization
problem (15.4). But to solve this problem an algebraic method is proposed,
cf. [35, p. 108].

(ii) The only purely analytic method known to us is the one outlined
in §14. This will work when F' satisfies condition (13.13), i.e. for
rational F', only when det $F'(e^{i\theta})$ > 0 for all $\theta$. This method has been
adapted to continuous time processes by Wong and Thomas [39], who also
point out some short cuts in case F' is rational. But their paper contains
several obscurities. This question of the extension of the factorization
algorithm to continuous time is also the subject of a recent dissertation of
H. Salehi [33]. In many practical situations, we would expect "weak
memory", i.e. $\Gamma_k$ = 0 for |k| > N. In such cases F' would be a
matrix trigonometrical polynomial (i.e. a rational function with poles only
at 0 and ∞). Each series on the right-hand side of (14.3) would then
terminate, as would the series (14.5), and the method would gain in
efficiency.
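
For readers who wish to experiment with the weak-memory case, the following
is a minimal sketch under stated assumptions. It does not implement the
series method of §14, nor the techniques of Yaglom or Youla. It treats only
the scalar case q = 1 with real covariances that vanish beyond lag N and a
density that is strictly positive on the circle, and it substitutes a
different, standard device: the Cholesky factor of a finite Toeplitz section
of the covariances, whose last row approximates the coefficients of R. The
names `spectral_factor`, `predictor_coefficients`, `cov` and `size` are
hypothetical, chosen for this illustration only.

```python
import math


def spectral_factor(cov, size=60):
    """Approximate scalar (q = 1) factor of a trigonometric-polynomial density.

    cov = [c_0, ..., c_N]: real covariances, assumed zero beyond lag N, whose
    density c_0 + 2*sum_k c_k*cos(k*theta) is assumed > 0 for all theta.
    Returns [b_0, ..., b_N] with R(z) = sum_k b_k z^k, read off the last row
    of the Cholesky factor of the size x size Toeplitz covariance matrix.
    """
    n_lags = len(cov) - 1
    low = [[0.0] * size for _ in range(size)]
    for i in range(size):
        # The Cholesky factor has the same band width as the covariances.
        for j in range(max(0, i - n_lags), i + 1):
            s = cov[i - j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    # Entry (last, last - k) approximates b_k once size is large enough.
    return [low[size - 1][size - 1 - k] for k in range(n_lags + 1)]


def predictor_coefficients(b, lag, n_terms):
    """First n_terms Taylor coefficients of [z^(-lag) R(z)]_{0+} / R(z)."""
    tail = list(b[lag:])                      # b_lag, b_{lag+1}, ...
    numerator = (tail + [0.0] * n_terms)[:n_terms]
    y = [0.0] * n_terms
    for n in range(n_terms):
        # Power-series division: sum_j b_j * y_{n-j} = numerator_n.
        acc = sum(b[j] * y[n - j] for j in range(1, min(n, len(b) - 1) + 1))
        y[n] = (numerator[n] - acc) / b[0]
    return y


# Density |1 + 0.5 e^{i theta}|^2, i.e. covariances c_0 = 1.25, c_1 = 0.5.
b = spectral_factor([1.25, 0.5])            # approximately [1.0, 0.5]
y = predictor_coefficients(b, lag=1, n_terms=3)   # approximately [0.5, -0.25, 0.125]
```

The default `size=60` is an arbitrary choice for the sketch. The
approximation improves as the Toeplitz section grows, and more slowly when
the density comes close to vanishing on the circle.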

Which of these methods for finding R and $Y_\nu$ is best suited to the
modern digital computer? A weakness of the analytical technique of §14 is
the occurrence of alternating signs in the series (14.3), resulting perhaps
in  slow  convergence.  On  the  other  hand,  the  algebraic  techniques  that 
have  been  proposed  involve  the  solution  of  large  systems  of  linear  equations, 
and  it  is  not  clear  to  the  writer  if  they  are  generally  more  efficient.  A 
comparative  study  of  the  effectiveness  of  all  these  methods  on  the  computer 
would  be  very  useful.  Some  interest  has  been  aroused  recently  in  this 
question  because  of  its  bearing  on  the  discrimination  of  seismic  signals 
due  to  different  types  of  subterranean  disturbances,  cf.  [28],  [40].  With 
a  slight  smile  one  may  say  that  an  answer  could  even  contribute  to  world 
peace. 
FOOTNOTES


1. A precise rendition of this statement will follow in a moment.

2. In calling this matrix a covariance matrix we are assuming tacitly that each
$E(f_n^i) = 0$. This assumption entails no real loss of generality since
our SP is stationary and therefore the vector $E(f_n)$ is independent
of n. Alternatively, we may allow our SP $(f_n)_{-\infty}^{\infty}$ to be
non-stationary, assuming only that (1.4) holds for $f'_n$, where
${f'}_n^{\,i} = f_n^i - E(f_n^i)$.


3. Our usage of bold face letters is as follows: f, g, etc. denote members
of $H^q$, and M, N denote subsets of $H^q$. A, B, etc. denote q×q
matrices with complex entries, and $\Phi$, $\Psi$, etc. denote q×q matrix-valued
functions.

4. We write A ≥ B or B ≤ A to mean that the matrix A-B is non-negative
definite. A* denotes the adjoint of A.

5. For $G \subseteq H^q$, $\mathfrak{S}(G)$ denotes the smallest subspace of $H^q$ containing G.
N.B. Linear combinations must be taken with matrix coefficients.
The symbol $\underset{d}{=}$ should be read "equals by definition". We shall
often use it to introduce previously undefined expressions.

6. U can of course be extended (non-uniquely) to a unitary operator on
H onto H.

7. $\mathrm{Rstr.}_D A$ denotes the restriction of the operator A to the subset D of its
domain.

8. For q = 1, it was first proved by Wold [38] in 1938, and extended to
continuous time by Karhunen [11] and Hanner [7]. For q > 1, it was
conjectured by Zasuhin [43], and proved in the full rank case by Doob [4]
and in general by Wiener and the writer [36, I]. The present method of
obtaining it is given in [22, 3.1].

9. The Euclidean norm $|A|_E$ of a matrix $A = [a_{ij}]$ is defined by

    $|A|_E^2 = \operatorname{trace} AA^* = \sum_{i=1}^{q} \sum_{j=1}^{q} |a_{ij}|^2.$


10. and hence by stationarity for all integers n (and also n = ∞).

11. With n = ∞, (4.19) reads: $(u_n)_{-\infty}^{\infty}$ is subordinate to $(f_n)_{-\infty}^{\infty}$,
cf. §8 below.

12. With the probabilistic interpretation of H, viz. $H = L_2(\Omega, \mathcal{B}, P)$, $\xi$
becomes a q-variate random measure over ([0, 2π], B), but with the
nice property that the q-variates corresponding to disjoint Borel sets
are uncorrelated.

13. Actually Rosenberg takes rectangular $\Phi$, $\Psi$ in Def. (6.3), of sizes
p×q and q×r, respectively, and his result [32, 3.9] applies to all
the $L_{2,M}$ spaces obtained with different choices of p.

14. See Footnote 5 for the meaning of this symbol $\mathfrak{S}$.

15.  Unfortunately  there  is  no  published  work  which  treats  the  results  of  this 
section  from  our  point  of  view.  A  treatment  from  a  somewhat  different 
standpoint is available in Rosanov [31, Ch. I, §§7, 8].

16. adj A denotes the adjugate matrix of A, i.e. the transpose of the
matrix formed by the cofactors of A, so that

    A · (adj A) = (det A) I = (adj A) · A.

17. That every matricial distribution F possesses such parts
is a celebrated result of Cramer [3, §4, Thm. 2]. See [19, 1.1] for a
formulation of this result, especially suitable for our purposes.

18. $M^{i_1, \ldots, i_p}_{j_1, \ldots, j_p}$ denotes the p×p minor of F' made up of the rows
$i_1, \ldots, i_p$, and the columns $j_1, \ldots, j_p$.

19. For convenience we have transplanted the functions $\Psi$, $Y_\nu$ from
[0, 2π] to the circle C = [|z| = 1].

20. $\Psi_+$ denotes the holomorphic extension to D+ = [|z| < 1] of a function
$\Psi$ in $L_2^{0+}(C)$.

21. The word "maximal" is used instead of "optimal" in the English
translations of the Russian literature, in which a less explicit but related
result appears, cf. [30, Thm. 4]. For q = 1, the word "outer" has been
used by Beurling [2] and his disciples.

22.  i.e.  since  the  spectral  distribution  F  is  absolutely  continuous, 

/  £  I  eki0-Y  (e10)  |-s/F' (0)d6 -0,  as  n—  . 

0  k=0  VK 

23.  Obviously,  is  the  kth  Fourier  coefficient  of  -I_  . 

24. It is now convenient to transplant F' from [0, 2π] to C.

25. Rosanov's proof can be simplified by appealing to the generalization of
the classical canonical factorization of Hardy class functions to matrix-valued
functions of the Hardy class $H_2$ on $D_+$ given in [18, 20, 22].
This generalization employed prediction theory as well as Potapov's
important work [25], and illustrates how the subject has ramified.


